[SYCL][Host Task] Optimize blocked users tracking to prevent execution time explosion for long dependency chains #18501
This commit addresses a performance issue observed when submitting consecutive host tasks to an in-order queue without an explicit `wait()`. The execution time of each host task increased significantly as the number of submissions grew: #18500.

The root cause was the unnecessary tracking of indirect blocking dependencies in `MBlockedUsers`. Previously, all direct and indirect blocking relations between enqueued commands were tracked, causing a significant increase in notification time upon task completion. For example, in a sequence of tasks `A, B, C, D`, `A.MBlockedUsers` would redundantly include `{C, D}`, even though these tasks are already blocked by `B`.

To resolve this, the `enqueueCommand` function in the Scheduler was enhanced to include a `RecursionDepth` parameter. This change prevents excessive growth in the size of `Cmd->MBlockedUsers` in long dependency chains by tracking only direct blocking dependencies, thereby reducing notification time upon command completion.
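
To illustrate why the size of the blocked-users lists matters, here is a minimal toy model of a linear dependency chain. It is not the DPC++ scheduler code and does not model `RecursionDepth`; `ToyCommand`, `buildChain`, and `totalNotifications` are hypothetical names introduced only for this sketch. It assumes each command depends on the one before it and compares registering all direct and indirect blocked users against registering only the direct one.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a scheduler command (hypothetical, not the real Command class).
struct ToyCommand {
  // Indices of the commands that must be notified when this one completes.
  std::vector<std::size_t> blockedUsers;
};

// Builds a linear chain of n commands where command i depends on command i-1.
// directOnly == false: every earlier command records every later command
//                      (direct and indirect blocking relations).
// directOnly == true:  only the immediate predecessor records the new command.
std::vector<ToyCommand> buildChain(std::size_t n, bool directOnly) {
  std::vector<ToyCommand> chain(n);
  for (std::size_t user = 1; user < n; ++user) {
    const std::size_t firstBlocker = directOnly ? user - 1 : 0;
    for (std::size_t blocker = firstBlocker; blocker < user; ++blocker)
      chain[blocker].blockedUsers.push_back(user);
  }
  return chain;
}

// Total number of notifications issued if every command in the chain
// completes and walks its own blockedUsers list once.
std::size_t totalNotifications(const std::vector<ToyCommand> &chain) {
  std::size_t total = 0;
  for (const ToyCommand &cmd : chain)
    total += cmd.blockedUsers.size();
  return total;
}
```

In this model, for the four-command chain `A, B, C, D`, `buildChain(4, false)[0].blockedUsers` holds three entries (`B`, `C`, `D`), while `buildChain(4, true)[0].blockedUsers` holds only `B`. Across a chain of `n` commands the total notification work is `n * (n - 1) / 2` with indirect tracking and `n - 1` with direct-only tracking, which is consistent with the per-task slowdown described above.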