[fp8] fix cb scheduler step tests #491
Signed-off-by: Yannick Schnider <[email protected]>
👋 Hi! Thank you for contributing to vLLM support on Spyre.
tests/llm_cache.py (Outdated)
tokenizer=model_name,
max_model_len=max(max_model_len, 256),
max_num_seqs=max_num_seqs,
max_num_seqs=max(max_num_seqs, 4),
This will leave the test name as-is, which might be confusing (since the test will still say BS 2).
bot:test
I'm fine with this as long as the tests pass. Just for the record, could you add an explanation, either here in the PR or in the code, of why this is required?
+1, can we extend the 🌶️🌶️🌶️ comment right above these changes?
Another +1 for documenting the solution and the reasoning. If we have this constraint on the batch size, should we also update the platform somehow to give feedback to the user?
Just added some more comments.
bot:test
bot:test |
This change always compiles for the next power of two of the batch size.
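As a rough sketch of the rounding behavior described above, the next power of two for a batch size can be computed with a small helper. The function name and the standalone form are illustrative, not code from this PR:

```python
def next_power_of_two(n: int) -> int:
    """Round n up to the nearest power of two (requires n >= 1).

    (n - 1).bit_length() gives the number of bits needed to represent
    n - 1, so shifting 1 left by that amount yields the smallest power
    of two that is >= n.
    """
    return 1 << (n - 1).bit_length()


# Illustrative batch sizes a compiler using this scheme would target:
for bs in (1, 2, 3, 4, 5, 8, 9):
    print(bs, "->", next_power_of_two(bs))
```

Under this scheme a requested batch size of 3 would compile for 4, and 5 through 8 would all share the compiled shape for 8, which is one common reason to pad batch sizes up front.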