🐛 fix static scheduling issues with long prompts #206
Conversation
Signed-off-by: Joe Runde <[email protected]>
👋 Hi! Thank you for contributing to vLLM support on Spyre.
Or this can be done with
Now you are good to go 🚀
Looks good, I'll get this built and tested on Power. Thanks Joe.
Nice simple fix and well-written test. LGTM!
Very nice catch and fix, Joe. Left a minor comment :)
tests/e2e/test_spyre_basic.py (Outdated)
vllm_sampling_params = SamplingParams(max_tokens=20,
                                      temperature=0,
                                      stop="1",
I guess we don't need `stop="1"` here, looks like a copy-paste relic :)
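For illustration only, the suggested cleanup would look roughly like the sketch below. This is just an illustration of the reviewer's point, not the final committed test code; the variable name is taken from the snippet above and `SamplingParams` is vLLM's standard sampling-parameters class.

```python
from vllm import SamplingParams

# Greedy decoding with a fixed token budget; the stray stop="1" left over
# from the copied example is dropped, as suggested in the review comment.
vllm_sampling_params = SamplingParams(max_tokens=20, temperature=0)
```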
Signed-off-by: Joe Runde <[email protected]>
This fixes an issue where batches of requests with long prompts would not be properly scheduled. Even with a long queue of requests, smaller-than-full batches would be scheduled, because requests would be rejected from the schedule once the total number of prompt tokens was >= `--max-num-batched-tokens`.

The `--max-num-batched-tokens` config is designed for chunking up prompts for chunked prefill, and isn't relevant for static batching. This PR sets `--max-num-batched-tokens` to the maximum number of prompt tokens that could be in a full batch, so that the chunked prefill logic doesn't prevent us from creating large batches.

I missed this before because it requires a lower-level test that invokes the scheduler directly in order to ensure that the full batches are actually scheduled.
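For context, the idea behind the fix can be sketched roughly as follows. This is a hedged illustration rather than the PR's actual diff: the helper name `adjust_max_num_batched_tokens` and the exact place the override happens in the plugin are assumptions; only the attribute names (`scheduler_config.max_num_seqs`, `scheduler_config.max_num_batched_tokens`, `model_config.max_model_len`) come from vLLM's config objects.

```python
def adjust_max_num_batched_tokens(vllm_config) -> None:
    # vllm_config is assumed to be vLLM's VllmConfig, which carries both
    # scheduler_config and model_config; where the plugin actually applies
    # this override is not shown in the PR description above.
    scheduler_config = vllm_config.scheduler_config
    model_config = vllm_config.model_config

    # With static batching, a full batch holds at most max_num_seqs requests,
    # each contributing at most max_model_len prompt tokens.
    max_prompt_tokens_per_batch = (
        scheduler_config.max_num_seqs * model_config.max_model_len
    )

    # Raise the token budget so the chunked-prefill check can never reject
    # a request that would otherwise fit into a full static batch.
    scheduler_config.max_num_batched_tokens = max_prompt_tokens_per_batch
```

A scheduler-level test, as the description mentions, can then queue more long-prompt requests than `max_num_seqs` and assert that each scheduling step actually produces a full batch.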