-
Notifications
You must be signed in to change notification settings - Fork 26
[Bugfix][V0][V1] Fix crashes from cancelling requests #64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Hi! Thank you for contributing to vLLM support on Spyre.
Now you are good to go 🚀 |
2e0e31e
to
71435ab
Compare
vllm_spyre/v1/core/scheduler.py
Outdated
else: | ||
# this try-except is the specialization for Spyre | ||
try: | ||
self.holdback_queue.remove(request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah dang, this is unfortunate.
I think maybe we can fix this in a simpler way by removing self.holdback_queue
as an instance attribute, and instead just make it a local variable during self.schedule()
. After we schedule a new batch, we can take all the requests that we held back and put them back in self.waiting
, and then we won't need to worry about breaking assumptions that the v1 scheduler has
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would let us get rid of the override on get_num_unfinished_requests
as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
I left it as an instance variable so that we don't remake the deque. It is also still used in _handle_rejects
, though there probably can't be rejection during scheduling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I think that it should be safe to remove it from usage in _handle_rejects
as well since this should all be synchronous. The output processing that calls _handle_rejects
can't be happening concurrently with scheduling a new forward pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave it up to you on merging now vs. pulling out of _handle_rejects as well. I think we'll be getting rid of this rejected request business soon anyway
71435ab
to
1aabb0b
Compare
Signed-off-by: Travis Johnson <[email protected]>
Signed-off-by: Travis Johnson <[email protected]>
Signed-off-by: Travis Johnson <[email protected]>
Signed-off-by: Travis Johnson <[email protected]>
Signed-off-by: Travis Johnson <[email protected]>
1aabb0b
to
d837329
Compare
* fix: add optional arg to abort_seq_group for compat with v0.8 Signed-off-by: Travis Johnson <[email protected]> * fix: guard against KeyError with _req_ids2idx Signed-off-by: Travis Johnson <[email protected]> * fix: specialize finish_requests in V1 scheduler Signed-off-by: Travis Johnson <[email protected]> * fix: check against None... Signed-off-by: Travis Johnson <[email protected]> * refactor: make holdback queue use more temporary Signed-off-by: Travis Johnson <[email protected]> --------- Signed-off-by: Travis Johnson <[email protected]>
Some fixes from testing handling of request cancellation:
_req_ids2idx
finish_requests()
to handle the holdback_queueFIX #36