[Bugfix] Fix triton import with local TritonPlaceholder #17446

Merged
merged 3 commits into vllm-project:main from the fix_triton_placeholder branch
May 6, 2025

Conversation

MengqingCao
Contributor

@MengqingCao MengqingCao commented Apr 30, 2025

Fix the triton import error on non-Triton platforms by using the local TritonPlaceholder.
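
As a rough illustration of the placeholder approach described here, the sketch below falls back to a stand-in object when `triton` cannot be imported. Only the names `HAS_TRITON`, `TritonPlaceholder`, and `TritonLanguagePlaceholder` come from this PR; the class bodies, the pass-through decorator behaviour, and the `_noop_decorator` helper are assumptions made for illustration, and the actual code in `vllm.triton_utils.importing` may differ.

```python
import types


def _noop_decorator(*args, **kwargs):
    """Hypothetical helper that works as both @deco and @deco(...)."""
    if len(args) == 1 and not kwargs and callable(args[0]):
        return args[0]  # bare decorator: return the function unchanged
    return lambda fn: fn  # called with arguments: return a pass-through decorator


class TritonLanguagePlaceholder(types.ModuleType):
    """Stand-in for triton.language (attributes are illustrative)."""

    def __init__(self):
        super().__init__("triton.language")
        self.constexpr = None
        self.dtype = None


class TritonPlaceholder(types.ModuleType):
    """Stand-in for the top-level triton module (attributes are illustrative)."""

    def __init__(self):
        super().__init__("triton")
        self.jit = _noop_decorator
        self.autotune = _noop_decorator
        self.heuristics = _noop_decorator
        self.language = TritonLanguagePlaceholder()


try:
    import triton
    import triton.language as tl
    HAS_TRITON = True
except ImportError:
    # Local fallback objects; nothing is registered in sys.modules here.
    triton = TritonPlaceholder()
    tl = triton.language
    HAS_TRITON = False


# With the placeholder, decorated functions stay plain Python callables.
placeholder = TritonPlaceholder()


@placeholder.jit
def add_one(x):
    return x + 1
```

Keeping the fallback as a local object, rather than inserting it into `sys.modules`, means other libraries that probe for `triton` themselves still see that it is absent.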

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

mergify bot commented May 2, 2025

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @MengqingCao.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label May 2, 2025
@MengqingCao MengqingCao force-pushed the fix_triton_placeholder branch from 229f545 to 67de861 May 2, 2025 09:27
@mergify mergify bot removed the needs-rebase label May 2, 2025
@MengqingCao MengqingCao changed the title from "[Bugfix] Fix TritonPlaceholder conflicts with torch.compile" to "[Bugfix] Fix triton import with local TritonPlaceholder" May 2, 2025
@MengqingCao MengqingCao force-pushed the fix_triton_placeholder branch 4 times, most recently from f89bea8 to 46aea41 May 2, 2025 10:03
@MengqingCao
Contributor Author

Please help review this PR, thanks! @Isotr0py @zou3519 @houseroad

Yikun added a commit to vllm-project/vllm-ascend that referenced this pull request May 5, 2025
### What this PR does / why we need it?
Re-patch TritonPlaceholder on main to make CI happy
- Add the triton patch back until vllm-project/vllm#17446 is resolved
- Move patch_main before patch_common to resolve the minicpm triton import issue
- Add `0.8.5` and `0.8.5.post1` to make the patch work on all 0.8.5 versions

Related:
- #704
- #690

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
All CI passed, including main.

Signed-off-by: Yikun Jiang <[email protected]>
Collaborator

@houseroad houseroad left a comment

The changes make sense to me. Left one inline comment.

@@ -1,5 +1,13 @@
 # SPDX-License-Identifier: Apache-2.0
 
-from vllm.triton_utils.importing import HAS_TRITON
+from vllm.triton_utils.importing import (HAS_TRITON, TritonLanguagePlaceholder,
Collaborator

Can we add some unit tests for this placeholder logic?

Contributor Author

Sure, I'll add them soon.
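
A unit test for the placeholder logic requested in this thread might look roughly like the following. It is a self-contained sketch rather than the test added to vLLM: `load_triton` and `_Placeholder` are hypothetical stand-ins, and a non-Triton platform is simulated by putting a `None` entry for `triton` in `sys.modules`, which makes the import raise `ImportError`.

```python
import importlib
import sys
import types
import unittest
from unittest import mock


class _Placeholder(types.ModuleType):
    """Hypothetical stand-in module whose jit decorator is a pass-through."""

    def __init__(self):
        super().__init__("triton")
        self.jit = lambda fn: fn


def load_triton():
    """Hypothetical loader: return (module, has_triton)."""
    try:
        return importlib.import_module("triton"), True
    except ImportError:
        return _Placeholder(), False


class PlaceholderTest(unittest.TestCase):

    def test_falls_back_when_triton_is_missing(self):
        # A None entry in sys.modules makes importing triton raise ImportError.
        with mock.patch.dict(sys.modules, {"triton": None}):
            module, has_triton = load_triton()
        self.assertFalse(has_triton)
        self.assertIsInstance(module, _Placeholder)

    def test_placeholder_does_not_leak_into_sys_modules(self):
        with mock.patch.dict(sys.modules, {"triton": None}):
            module, _ = load_triton()
        # patch.dict restores sys.modules, and the loader never registered it.
        self.assertIsNot(sys.modules.get("triton"), module)

    def test_placeholder_jit_returns_function_unchanged(self):
        with mock.patch.dict(sys.modules, {"triton": None}):
            module, _ = load_triton()

        @module.jit
        def kernel():
            return "ran"

        self.assertEqual(kernel(), "ran")
```

Simulating the missing package through `sys.modules` lets the same test run on machines that do have Triton installed.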

@houseroad houseroad added the ready label (ONLY add when PR is ready to merge/full CI is needed) May 6, 2025
Signed-off-by: Mengqing Cao <[email protected]>
Signed-off-by: Mengqing Cao <[email protected]>
@Isotr0py Isotr0py enabled auto-merge (squash) May 6, 2025 05:59
@MengqingCao
Contributor Author

MengqingCao commented May 6, 2025

@youkaichao @Isotr0py Sorry to bother you. CI failed due to unrelated unit tests (I tested locally without this PR and the same timeout error was raised).

Could we fix them in a follow-up PR and merge this one first?

@youkaichao youkaichao disabled auto-merge May 6, 2025 09:53
@youkaichao youkaichao merged commit f9bc5a0 into vllm-project:main May 6, 2025
63 of 66 checks passed
@mgoin
Member

mgoin commented May 6, 2025

Could we add a pre-commit or GHA check that no new `import triton` statements are being added in PRs?

Yikun added a commit to vllm-project/vllm-ascend that referenced this pull request May 6, 2025
### What this PR does / why we need it?
- Revert "Re-patch TritonPlaceholder on main to make CI happy (#753)" because the upstream fix has already been merged into main: vllm-project/vllm#17446
- Keep compatibility with 0.8.5.post1

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

---------

Signed-off-by: Yikun Jiang <[email protected]>
@MengqingCao MengqingCao deleted the fix_triton_placeholder branch May 6, 2025 11:05
@MengqingCao
Contributor Author

> Could we add a pre-commit or GHA check that there are no new import triton being added in PRs?

Thanks for this good catch, I'll add it soon.
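
A check along the lines suggested above could be sketched as a small script that a pre-commit hook or GitHub Actions job runs over the changed files. This is an assumption about how such a check might work, not the check that was later added: the regular expression, the `ALLOWED_PARTS` allowlist, and the function names are all hypothetical, and imports written as `import torch, triton` are not detected.

```python
import re
from pathlib import Path

# Matches `import triton`, `import triton.language as tl`, `from triton import x`
# and `from triton.language import y` at the start of a (possibly indented) line.
TRITON_IMPORT = re.compile(r"^\s*(?:import|from)\s+triton(?:\.\w+)*(?:\s|$)")

# Hypothetical allowlist: path components allowed to import triton directly.
ALLOWED_PARTS = ("triton_utils",)


def find_direct_imports(source: str) -> list[int]:
    """Return the 1-based line numbers that import triton directly."""
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if TRITON_IMPORT.match(line)
    ]


def check_files(paths: list[str]) -> int:
    """Print offending lines; return 1 if any were found, otherwise 0."""
    failed = False
    for name in paths:
        path = Path(name)
        if any(part in ALLOWED_PARTS for part in path.parts):
            continue  # the wrapper package itself may import triton
        for lineno in find_direct_imports(path.read_text(encoding="utf-8")):
            print(f"{name}:{lineno}: direct triton import found")
            failed = True
    return 1 if failed else 0
```

A hook entry point could pass the file names it receives on the command line to `check_files` and use the return value as the process exit code.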

robertgshaw2-redhat added a commit to neuralmagic/vllm that referenced this pull request May 6, 2025
* [Model] Add GraniteMoeHybrid 4.0 model (vllm-project#17497)

Signed-off-by: Thomas Ortner <[email protected]>
Signed-off-by: Stanislaw Wozniak <[email protected]>
Co-authored-by: Thomas Ortner <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Tyler Michael Smith <[email protected]>

* [easy] Fix logspam on PiecewiseBackend errors (vllm-project#17138)

Signed-off-by: rzou <[email protected]>

* [Bugfix] Fixed prompt length for random dataset (vllm-project#17408)

Signed-off-by: Mikhail Podvitskii <[email protected]>

* [Doc] Update notes for H2O-VL and Gemma3 (vllm-project#17219)

Signed-off-by: DarkLight1337 <[email protected]>

* [Misc] Fix ScalarType float4 naming  (vllm-project#17690)

Signed-off-by: Lucas Wilkinson <[email protected]>

* Fix `dockerfilegraph` pre-commit hook (vllm-project#17698)

Signed-off-by: Harry Mellor <[email protected]>

* [Bugfix] Fix triton import with local TritonPlaceholder (vllm-project#17446)

Signed-off-by: Mengqing Cao <[email protected]>

* [V1] Enable TPU V1 backend by default (vllm-project#17673)

Signed-off-by: mgoin <[email protected]>

* [V1][PP] Support PP for MultiprocExecutor (vllm-project#14219)

Signed-off-by: jiang1.li <[email protected]>
Signed-off-by: jiang.li <[email protected]>

* [v1] AttentionMetadata for each layer (vllm-project#17394)

Signed-off-by: Chen Zhang <[email protected]>

* [Feat] Add deprecated=True to CLI args (vllm-project#17426)

Signed-off-by: Aaron Pham <[email protected]>

* [Docs] Use gh-file to add links to tool_calling.md (vllm-project#17709)

Signed-off-by: windsonsea <[email protected]>

* [v1] Introduce KVCacheBlocks as interface between Scheduler and KVCacheManager (vllm-project#17479)

Signed-off-by: Chen Zhang <[email protected]>

* [doc] Add RAG Integration example (vllm-project#17692)

Signed-off-by: reidliu41 <[email protected]>
Co-authored-by: reidliu41 <[email protected]>

* [Bugfix] Fix modality limits in vision language example (vllm-project#17721)

Signed-off-by: DarkLight1337 <[email protected]>

* Make right sidebar more readable in "Supported Models" (vllm-project#17723)

Signed-off-by: Harry Mellor <[email protected]>

* [TPU] Increase block size and reset block shapes (vllm-project#16458)

* [Misc] Add Next Edit Prediction (NEP) datasets support in `benchmark_serving.py` (vllm-project#16839)

Signed-off-by: dtransposed <damian@damian-ml-machine.europe-west3-b.c.jetbrains-grazie.internal>
Signed-off-by: dtransposed <>
Co-authored-by: dtransposed <damian@damian-ml-machine.europe-west3-b.c.jetbrains-grazie.internal>

* [Bugfix] Fix for the condition to accept empty encoder inputs for mllama (vllm-project#17732)

Signed-off-by: Gregory Shtrasberg <[email protected]>

* [Kernel] Unified Triton kernel that doesn't distinguish between prefill + decode (vllm-project#16828)

Signed-off-by: Thomas Parnell <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
Co-authored-by: Lucas Wilkinson <[email protected]>

---------

Signed-off-by: Thomas Ortner <[email protected]>
Signed-off-by: Stanislaw Wozniak <[email protected]>
Signed-off-by: rzou <[email protected]>
Signed-off-by: Mikhail Podvitskii <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: Mengqing Cao <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: jiang1.li <[email protected]>
Signed-off-by: jiang.li <[email protected]>
Signed-off-by: Chen Zhang <[email protected]>
Signed-off-by: Aaron Pham <[email protected]>
Signed-off-by: windsonsea <[email protected]>
Signed-off-by: reidliu41 <[email protected]>
Signed-off-by: dtransposed <damian@damian-ml-machine.europe-west3-b.c.jetbrains-grazie.internal>
Signed-off-by: dtransposed <>
Signed-off-by: Gregory Shtrasberg <[email protected]>
Signed-off-by: Thomas Parnell <[email protected]>
Signed-off-by: [email protected] <[email protected]>
Co-authored-by: Stan Wozniak <[email protected]>
Co-authored-by: Thomas Ortner <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Tyler Michael Smith <[email protected]>
Co-authored-by: Richard Zou <[email protected]>
Co-authored-by: Mikhail Podvitskii <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Lucas Wilkinson <[email protected]>
Co-authored-by: Harry Mellor <[email protected]>
Co-authored-by: Mengqing Cao <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
Co-authored-by: Li, Jiang <[email protected]>
Co-authored-by: Chen Zhang <[email protected]>
Co-authored-by: Aaron Pham <[email protected]>
Co-authored-by: Michael Yao <[email protected]>
Co-authored-by: Reid <[email protected]>
Co-authored-by: reidliu41 <[email protected]>
Co-authored-by: Jevin Jiang <[email protected]>
Co-authored-by: d.transposed <[email protected]>
Co-authored-by: dtransposed <damian@damian-ml-machine.europe-west3-b.c.jetbrains-grazie.internal>
Co-authored-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: Thomas Parnell <[email protected]>
Co-authored-by: Lucas Wilkinson <[email protected]>
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
mawong-amd pushed a commit to ROCm/vllm that referenced this pull request May 14, 2025
gshtras added a commit to ROCm/vllm that referenced this pull request May 15, 2025
Signed-off-by: Gregory Shtrasberg <[email protected]>
gshtras added a commit to ROCm/vllm that referenced this pull request May 15, 2025
Signed-off-by: Gregory Shtrasberg <[email protected]>
Labels
ready (ONLY add when PR is ready to merge/full CI is needed), v1
7 participants