debug test sampler inconsistency #16939

Chenyaaang · 2025-04-21T19:09:23Z

This fails vllm/tests/v1/tpu/test_sampler.py at line 46: the assertion that identical sampling_params produce identical results does not hold.
The two results are:
"The robot is a humanoid with a sleek, metallic body and a pair of glowing eyes. It has been programmed to follow orders and perform tasks, but it has always been a machine, devoid of emotions and consciousness. One day, while working in a factory, the robot overhears a conversation between two workers. They are"
vs
"The robot is a humanoid with a sleek, metallic body and a pair of glowing eyes. It has been programmed to perform tasks for humans, but it has always been a machine, devoid of emotions or consciousness. One day, while working in a factory, the robot experiences a strange sensation in its mind. It begins to"
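To see where the two outputs quoted above part ways, a small sketch like the following can help. It compares whitespace-separated words rather than model tokens, which is an assumption made for illustration (the real tokenizer would split differently), and the helper name `first_divergence` is hypothetical.

```python
def first_divergence(text_a, text_b):
    """Return (index, word_a, word_b) for the first differing word, or None."""
    words_a, words_b = text_a.split(), text_b.split()
    for index, (word_a, word_b) in enumerate(zip(words_a, words_b)):
        if word_a != word_b:
            return index, word_a, word_b
    return None  # one text is a prefix of the other (or they are equal)


# The two outputs reported in the description above.
result_a = (
    "The robot is a humanoid with a sleek, metallic body and a pair of "
    "glowing eyes. It has been programmed to follow orders and perform "
    "tasks, but it has always been a machine, devoid of emotions and "
    "consciousness. One day, while working in a factory, the robot "
    "overhears a conversation between two workers. They are"
)
result_b = (
    "The robot is a humanoid with a sleek, metallic body and a pair of "
    "glowing eyes. It has been programmed to perform tasks for humans, "
    "but it has always been a machine, devoid of emotions or "
    "consciousness. One day, while working in a factory, the robot "
    "experiences a strange sensation in its mind. It begins to"
)

divergence = first_divergence(result_a, result_b)
print(divergence)  # (21, 'follow', 'perform')
```

Under this word-level approximation the outputs share their first 21 words and differ at "follow" versus "perform".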

github-actions · 2025-04-21T19:09:32Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

vanbasten23 · 2025-04-21T19:49:56Z

tests/v1/tpu/test_sampler.py

@@ -23,7 +23,7 @@ def test_sampler_different(model_name: str):
    different results.
    """
    llm = LLM(model_name,
-              enforce_eager=False,
+              enforce_eager=True,


vanbasten23: Do you run the test as VLLM_USE_V1=1 python tests/v1/tpu/test_sampler.py?

Chenyaaang: Yes, with VLLM_USE_V1=1 pytest -s -vv tests/v1/tpu/test_sampler.py

Chenyaaang: Enabling enforce_eager makes the test faster.

vanbasten23: And the temperature is 0?

Chenyaaang: The temperature is set further down in the test. In the failing case the batch size is 16 and every temperature is 0.1, so it goes through the sampler rather than greedy sampling.

Chenyaaang: Sorry, I updated the description: the failure is in the assertion that the same sampling_params should lead to the same result.
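As background for the temperature-0.1 reply above, here is a minimal sketch of generic temperature scaling. It is not vLLM's sampler code; the function name and the logit values are made up for illustration. It shows why a low but nonzero temperature still leaves the draw stochastic when two logits are close.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Generic temperature-scaled softmax (illustrative, not vLLM's code)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; 0 means greedy argmax")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical logits where the top two candidates are nearly tied.
logits = [2.0, 1.9, 0.5]

probs_t1 = softmax_with_temperature(logits, 1.0)
probs_t01 = softmax_with_temperature(logits, 0.1)

print([round(p, 3) for p in probs_t1])
print([round(p, 3) for p in probs_t01])  # top token is about 0.73, runner-up about 0.27
```

With these made-up logits, temperature 0.1 sharpens the distribution but the runner-up still keeps roughly a 27% chance, so two runs can pick different tokens at the same position.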

Signed-off-by: Chenyaaang <[email protected]>

mgoin · 2025-04-21T20:32:29Z

There are natural numerical instabilities that make it difficult to have deterministic results over many tokens. I think it is reasonable for your test that ~20 tokens match before it starts to diverge.
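One way to express the loosened check suggested above is to require only that a leading run of tokens matches. This is a sketch under assumptions: the helper names, the default threshold of 20, and the toy token lists are hypothetical and do not come from the actual test file.

```python
def common_prefix_len(tokens_a, tokens_b):
    """Count how many leading items the two sequences share."""
    count = 0
    for item_a, item_b in zip(tokens_a, tokens_b):
        if item_a != item_b:
            break
        count += 1
    return count


def assert_prefix_matches(tokens_a, tokens_b, min_matching=20):
    """Fail only if the outputs diverge before `min_matching` tokens."""
    matched = common_prefix_len(tokens_a, tokens_b)
    assert matched >= min_matching, (
        f"outputs diverged after {matched} tokens, expected at least {min_matching}"
    )
    return matched


# Toy token IDs standing in for two generations with the same sampling_params.
run_a = list(range(30))
run_b = list(range(25)) + [99, 98, 97, 96, 95]

matched = assert_prefix_matches(run_a, run_b)
print(matched)  # 25
```

The threshold trades strictness for stability: a larger value catches real sampler regressions sooner, while a smaller one tolerates more numerical drift.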

NickLucche · 2025-04-22T10:06:59Z

I agree with @mgoin. I think you can safely loosen the checks on this test in your PR #16499 to land it.

My initial intention with this test was actually to test that different results would be produced when significantly different sampling params were provided. I think this particular check was changed after review.
Perhaps we could bring it back to its original logic.

mergify bot added the v1 and tpu (Related to Google TPUs) labels on Apr 21, 2025

Chenyaaang force-pushed the debug-test-sampler branch 3 times, most recently from 1d73364 to 0e80e6b, on April 21, 2025 19:18

vanbasten23 reviewed Apr 21, 2025

debug test sampler inconsitency

10ebc74

Signed-off-by: Chenyaaang <[email protected]>

Chenyaaang force-pushed the debug-test-sampler branch from 0e80e6b to 10ebc74 on April 21, 2025 19:55

Chenyaaang closed this Apr 23, 2025

Chenyaaang deleted the debug-test-sampler branch April 23, 2025 22:36
