Don't allow warmup shapes that exceed the max sequence length #185
Conversation
Signed-off-by: Max de Bayser <[email protected]>
👋 Hi! Thank you for contributing to vLLM support on Spyre.
vllm_spyre/platform.py
Outdated
max_seq_len = max(max_seq_len,
                  shape["prompt_length"] + shape["new_tokens"])
if max_seq_len > max_model_len:
    raise RuntimeError(
Could this check be moved into get_warmup_shapes, where the other validations on the warmup shapes occur?
Yes, this sounds reasonable. Other than that it looks good to me.
I did move it into get_warmup_shapes, but now it requires an extra parameter.
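As a rough sketch of what the relocated check could look like (this is illustrative only: the standalone signature, the `max_model_len` parameter name, and the error message are assumptions rather than the actual vllm-spyre code; only the core comparison mirrors the diff above):

```python
# Hypothetical sketch; signature and message wording are assumed.
def get_warmup_shapes(warmup_shapes: list[dict],
                      max_model_len: int) -> list[dict]:
    """Validate parsed warmup shapes against the model's max sequence length."""
    max_seq_len = 0
    for shape in warmup_shapes:
        # Each warmup shape occupies prompt_length + new_tokens positions.
        max_seq_len = max(max_seq_len,
                          shape["prompt_length"] + shape["new_tokens"])
    if max_seq_len > max_model_len:
        raise RuntimeError(
            f"A warmup shape requires {max_seq_len} sequence positions, "
            f"which exceeds the model's maximum sequence length "
            f"of {max_model_len}")
    return warmup_shapes
```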
lgtm now that the check is moved into cls.get_warmup_shapes()
Signed-off-by: Max de Bayser <[email protected]>
In V0, warmup shapes that result in sequence lengths longer than the maximum sequence length the model supports are not validated. When a request whose length falls between those two values comes in, the server crashes.
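To make the failure window concrete, here is a small sketch with made-up numbers (none of these values come from the PR):

```python
# Hypothetical configuration: the warmup shape overshoots the model limit.
max_model_len = 2048
warmup_shape = {"prompt_length": 2048, "new_tokens": 256}

# The shape covers 2304 positions, so the scheduler accepts a request of,
# say, 2100 tokens because it fits the warmup shape, even though the model
# itself only supports 2048 positions, and the server crashes.
max_warmup_seq_len = warmup_shape["prompt_length"] + warmup_shape["new_tokens"]
request_len = 2100
assert max_model_len < request_len <= max_warmup_seq_len
```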