Skip to content

Conversation

sducouedic
Copy link
Collaborator

@sducouedic sducouedic commented Jun 3, 2025

Copy link

github-actions bot commented Jun 3, 2025

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@prashantgupta24
Copy link
Collaborator

I like it!

Signed-off-by: Sophie du Couédic <[email protected]>
@sducouedic sducouedic force-pushed the test_cb_scheduler branch from 101a1c3 to 2eb3ad8 Compare June 3, 2025 21:01
sducouedic and others added 2 commits June 3, 2025 23:02
@sducouedic sducouedic marked this pull request as ready for review June 3, 2025 21:21
@sducouedic sducouedic changed the title Test continuous batching through the scheduler [CB] Test continuous batching through the scheduler Jun 3, 2025
# Prefill sequence 2
# TODO @Yannick, should the left padding ideally already be removed?
"step": 42,
"tkv": 103, # <-- should be 64?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand tkv is not 64 because this is not a new batch yet since request 1 is still running?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're other comment is correct, it is about calling reduce_left_padding

},
{
# Prefill sequence 2
# TODO @Yannick, should the left padding ideally already be removed?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The left padding is not removed because I see the reduce_left_padding function is called when preparing decode and not prefill?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right! I am wondering if it is the desired behaviour, and if something prevents to have the left padding already removed in the subsequent prefill. Conceptually it makes more sense to have the padding already removed if a prefill happens (ie. call reduce_left_padding in both decode and prefill). Also supposing the tkv is too close to max_model_len at the time of the prefill, the prompt would need to wait 1 decode, instead of being directly scheduled if the padding reduction happens directly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, that does make sense to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although it might be counter-intuitive since the left padding is introduced when you prepare a prefill for a new request? So ideally we could add logic while adding the padding instead of adding then reducing the padding if that makes sense

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see what you mean, that would be more code instead of just calling the reduce_left_padding function after adding the padding. Let's see what @yannicks1 has to say

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree and am working on reducing the left padding in every scheduler step (prefil and decode). Thanks for pointing this out!

Copy link
Collaborator

@prashantgupta24 prashantgupta24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I guess the left padding question still remains.

@yannicks1 yannicks1 self-requested a review June 4, 2025 09:10
Copy link
Collaborator

@yannicks1 yannicks1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nice work @sducouedic. Super clean tests on scheduler step level. I suggest to merge this. I am already preparing the improved reducing of the left padding on another branch and would then update the tests accordingly.

},
{
# Prefill sequence 2
# TODO @Yannick, should the left padding ideally already be removed?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree and am working on reducing the left padding in every scheduler step (prefil and decode). Thanks for pointing this out!

Co-authored-by: Yannick Schnider <[email protected]>
Signed-off-by: Sophie du Couédic <[email protected]>
@sducouedic sducouedic requested a review from rafvasq as a code owner June 5, 2025 11:35
@sducouedic sducouedic force-pushed the test_cb_scheduler branch from b34d290 to fdc8081 Compare June 5, 2025 11:50
@sducouedic sducouedic enabled auto-merge (squash) June 5, 2025 11:51
@github-actions github-actions bot added the ready label Jun 5, 2025
@sducouedic sducouedic merged commit 153fd2a into main Jun 5, 2025
22 checks passed
@sducouedic sducouedic deleted the test_cb_scheduler branch June 5, 2025 12:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CB] Test corner cases

3 participants