debug test sampler inconsistency #16939
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs will not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Force-pushed from 1d73364 to 0e80e6b
@@ -23,7 +23,7 @@ def test_sampler_different(model_name: str):
     different results.
     """
     llm = LLM(model_name,
-              enforce_eager=False,
+              enforce_eager=True,
You run the test as VLLM_USE_V1=1 python tests/v1/tpu/test_sampler.py?
Yes, VLLM_USE_V1=1 pytest -s -vv tests/v1/tpu/test_sampler.py
Enabling enforce_eager makes the test faster.
and the temperature is 0?
The temperature is set below. For the test that's failing, batch size = 16 and the temperatures are all 0.1, so it uses the random sampler instead of greedy sampling.
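To illustrate the distinction being discussed: with temperature 0 the next token is picked greedily (deterministic), while any positive temperature routes through a random draw over the softmax distribution. This is a minimal standalone sketch, not vLLM's actual sampler code; the function name and signature are hypothetical.

```python
import numpy as np

def sample_token(logits: np.ndarray, temperature: float,
                 rng: np.random.Generator) -> int:
    """Greedy pick when temperature == 0, otherwise sample from the
    temperature-scaled softmax distribution (hypothetical helper)."""
    if temperature == 0.0:
        return int(np.argmax(logits))          # deterministic
    scaled = logits / temperature
    scaled -= scaled.max()                     # numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))  # random draw

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5])
print(sample_token(logits, 0.0, rng))  # greedy: always index 0
```

With temperature = 0.1 (as in the failing batch), the test therefore exercises the random-sampling path, where tiny numerical differences in the logits can change the drawn token.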
Sorry, updated in the description: it fails when it asserts that the same sampling_params should lead to the same result.
Signed-off-by: Chenyaaang <[email protected]>
Force-pushed from 0e80e6b to 10ebc74
There are natural numerical instabilities that make it difficult to have deterministic results over many tokens. I think it is reasonable for your test that ~20 tokens match before it starts to diverge.
I agree with @mgoin. I think you can safely loosen the checks on this test in your PR #16499 to land it. My initial intention with this test was actually to test that different results would be produced when significantly different sampling params were provided. I think this particular check was changed after review.
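A loosened check along the lines suggested above might only require a prefix of the sampled tokens to agree, since numerical noise compounds over long generations. This is an illustrative sketch; PREFIX_LEN and the helper name are assumptions, not vLLM's API.

```python
# Hypothetical loosened assertion: only the first PREFIX_LEN tokens
# of the two generations must match.
PREFIX_LEN = 20

def same_up_to_prefix(tokens_a: list, tokens_b: list,
                      prefix_len: int = PREFIX_LEN) -> bool:
    """Compare only the leading prefix_len tokens of two generations."""
    return tokens_a[:prefix_len] == tokens_b[:prefix_len]

# Agrees on the first 3 tokens, diverges afterwards: passes.
assert same_up_to_prefix([1, 2, 3, 4], [1, 2, 3, 99], prefix_len=3)
# Diverges inside the prefix: fails.
assert not same_up_to_prefix([1, 2, 3], [1, 5, 3], prefix_len=3)
```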
This fails at vllm/tests/v1/tpu/test_sampler.py line 46: basically, when it asserts that the same sampling_params should give the same result, it fails.
the results are:
"The robot is a humanoid with a sleek, metallic body and a pair of glowing eyes. It has been programmed to follow orders and perform tasks, but it has always been a machine, devoid of emotions and consciousness. One day, while working in a factory, the robot overhears a conversation between two workers. They are"
vs
"The robot is a humanoid with a sleek, metallic body and a pair of glowing eyes. It has been programmed to perform tasks for humans, but it has always been a machine, devoid of emotions or consciousness. One day, while working in a factory, the robot experiences a strange sensation in its mind. It begins to"
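The two outputs above share a long identical prefix and only diverge partway through, which matches the "~20 tokens match before it starts to diverge" observation. A quick sketch (hypothetical helper, not part of the test) to locate the divergence point:

```python
a = ("The robot is a humanoid with a sleek, metallic body and a pair of "
     "glowing eyes. It has been programmed to follow orders")
b = ("The robot is a humanoid with a sleek, metallic body and a pair of "
     "glowing eyes. It has been programmed to perform tasks")

def common_prefix_len(x: str, y: str) -> int:
    """Length of the longest shared leading prefix of two strings."""
    n = 0
    for cx, cy in zip(x, y):
        if cx != cy:
            break
        n += 1
    return n

k = common_prefix_len(a, b)
print(a[:k])  # both runs agree up to "...programmed to " before diverging
```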