
[SYCL][Host Task] Optimize blocked users tracking to prevent execution time explosion for long dependency chains #18501


Draft: wants to merge 2 commits into base: sycl


@Nuullll
Contributor

Nuullll commented May 16, 2025

This commit addresses a performance issue observed when submitting consecutive host tasks to an in-order queue without an explicit wait(). The execution time of each host task increased significantly as the number of submissions grew (see #18500).

The root cause was identified as the unnecessary tracking of indirect blocking dependencies in MBlockedUsers. Previously, all direct and indirect blocking relations between enqueued commands were tracked, causing a significant increase in notification time upon task completion. For example, in a sequence of tasks A, B, C, D, A.MBlockedUsers would redundantly include {C, D}, even though these tasks are already blocked by B.

To resolve this, the enqueueCommand function in the Scheduler now takes a RecursionDepth parameter so that only direct blocking dependencies are tracked. This keeps Cmd->MBlockedUsers from growing excessively in long dependency chains and reduces notification time upon command completion.
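
To make the growth concrete, here is a small standalone sketch that counts how many blocked-user entries a linear chain accumulates under each policy. It is a toy model only: `ToyCommand` and `countBlockedUserEntries` are hypothetical names for illustration and do not correspond to the runtime's actual `Command` class or to how `RecursionDepth` is used in this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model, not the SYCL runtime's Command class.
struct ToyCommand {
  // Indices of the commands that wait on this one.
  std::vector<std::size_t> BlockedUsers;
};

// Builds a linear chain 0 -> 1 -> ... -> N-1 in which every command is
// blocked by its predecessor, and returns the total number of entries
// stored across all BlockedUsers lists.
std::size_t countBlockedUserEntries(std::size_t N, bool TrackIndirect) {
  std::vector<ToyCommand> Chain(N);
  for (std::size_t User = 1; User < N; ++User) {
    // Indirect tracking registers User with every earlier command;
    // direct tracking registers it with its immediate predecessor only.
    std::size_t FirstBlocker = TrackIndirect ? 0 : User - 1;
    for (std::size_t Blocker = FirstBlocker; Blocker < User; ++Blocker)
      Chain[Blocker].BlockedUsers.push_back(User);
  }

  std::size_t Total = 0;
  for (const ToyCommand &Cmd : Chain) {
    // With direct-only tracking a command in a chain has at most one user.
    assert(TrackIndirect || Cmd.BlockedUsers.size() <= 1);
    Total += Cmd.BlockedUsers.size();
  }
  return Total;
}
```

For the four-task chain A, B, C, D this model stores 6 entries with indirect tracking and 3 with direct-only tracking. In general a chain of N tasks stores N(N-1)/2 entries versus N-1, which is consistent with per-task notification cost growing with the number of submissions.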

@KseniyaTikhomirova
Contributor

KseniyaTikhomirova commented May 16, 2025

@Nuullll hi, this seems incorrect. The reason we track non-direct blocked users is that a host task enqueues its blocked users on completion. Suppose we have:
HT1
K2 depending on HT1
K3 depending on K2 (and implicitly depending on HT1)

Then the first enqueue attempt of K2 and K3 fails if HT1 has not completed.
In that case, if we enqueue only the direct dependency K2 on host task completion, there is nobody to enqueue K3.
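
The scenario above can be sketched as a small standalone simulation. This is a toy model under stated assumptions, not the scheduler's code: `ToyCmd`, `tryEnqueue`, `completeHostTask`, and `runScenario` are hypothetical names, each command has at most one dependency, and a completed host task simply retries the enqueue of every command in its blocked-users set.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy command; not the runtime's Command class.
struct ToyCmd {
  int Dep = -1;            // index of the single dependency, -1 if none
  bool IsHostTask = false;
  bool Completed = false;  // only meaningful for host tasks
  bool Enqueued = false;
  std::set<int> BlockedUsers;
};

// Tries to enqueue Cmds[Idx]. Returns -1 on success, otherwise the index
// of the uncompleted host task that blocks it.
int tryEnqueue(std::vector<ToyCmd> &Cmds, int Idx, bool TrackIndirect) {
  assert(Idx >= 0 && static_cast<std::size_t>(Idx) < Cmds.size());
  ToyCmd &Cmd = Cmds[Idx];
  if (Cmd.Enqueued)
    return -1;
  if (Cmd.Dep >= 0) {
    ToyCmd &Dep = Cmds[Cmd.Dep];
    int Blocker = -1;
    if (Dep.IsHostTask) {
      if (!Dep.Completed)
        Blocker = Cmd.Dep;
    } else {
      // A kernel dependency must itself be enqueued first.
      Blocker = tryEnqueue(Cmds, Cmd.Dep, TrackIndirect);
    }
    if (Blocker >= 0) {
      // Direct-only tracking registers the command only when the blocker
      // is its immediate dependency.
      if (TrackIndirect || Blocker == Cmd.Dep)
        Cmds[Blocker].BlockedUsers.insert(Idx);
      return Blocker;
    }
  }
  Cmd.Enqueued = true;
  return -1;
}

// Marks the host task as completed and retries its blocked users.
void completeHostTask(std::vector<ToyCmd> &Cmds, int Idx, bool TrackIndirect) {
  Cmds[Idx].Completed = true;
  std::set<int> Users = std::move(Cmds[Idx].BlockedUsers);
  Cmds[Idx].BlockedUsers.clear();
  for (int User : Users)
    tryEnqueue(Cmds, User, TrackIndirect);
}

// Runs HT1 <- K2 <- K3 and reports whether K2 and K3 ended up enqueued.
std::pair<bool, bool> runScenario(bool TrackIndirect) {
  std::vector<ToyCmd> Cmds(3);
  Cmds[0].IsHostTask = true;  // HT1: submitted but not yet completed
  Cmds[0].Enqueued = true;
  Cmds[1].Dep = 0;            // K2 depends on HT1
  Cmds[2].Dep = 1;            // K3 depends on K2
  tryEnqueue(Cmds, 1, TrackIndirect);  // fails, HT1 not completed
  tryEnqueue(Cmds, 2, TrackIndirect);  // fails, K2 not enqueued
  completeHostTask(Cmds, 0, TrackIndirect);
  return {Cmds[1].Enqueued, Cmds[2].Enqueued};
}
```

In this model, `runScenario(true)` leaves both K2 and K3 enqueued, while `runScenario(false)` enqueues K2 but leaves K3 stranded, because nothing retries K3 once K2 goes through. A direct-only scheme would therefore need some other mechanism (for example, K2 re-triggering its own blocked users after a successful enqueue) to avoid losing K3.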

@sarnex
Contributor

sarnex commented May 16, 2025

This PR seems to be causing build hangs, please fix before rerunning :)
