-
Notifications
You must be signed in to change notification settings - Fork 26
✅ add assertions for warmup mode context #294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Hi! Thank you for contributing to vLLM support on Spyre. Or this can be done with Now you are good to go 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if we addressed this with a refactor to put everything that should be inside the warmup context into its own method. As is the full warmup method here is already ~150 lines and should be broken up anyway
# setup dummy_requests
with _maybe_warmup_context():
self._dynamic_warmup(dummy_requests)
# the rest of the post-compile warmup
Signed-off-by: Prashant Gupta <[email protected]>
4a0d452 to
aa28708
Compare
Signed-off-by: Prashant Gupta <[email protected]>
Signed-off-by: Prashant Gupta <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Description
This PR adds assertions for operations done within the
warmup context. Earlier we saw a bug in which thedecodewarmup was moved outside the warmup context because of one simple indentation issue and it caused a segmentation fault with no clue as to why it was happening.These assertions should catch the error when running even on CPU.