Don't allow warmup shapes that exceed the max sequence length (#185)
In V0, warmup shapes are not validated against the maximum sequence
length that the model supports, so a shape can imply a longer sequence
than the model allows. When a request arrives whose length falls between
the two values, the server crashes:
```
WARNING 04-23 02:30:31 [scheduler.py:717] Input prompt (306 tokens) is too long and exceeds limit of 256
CRITICAL 04-23 02:30:31 [launcher.py:116] MQLLMEngine is already dead, terminating server process
INFO: 127.0.0.1:54294 - "POST /v1/embeddings HTTP/1.1" 500 Internal Server Error
ERROR 04-23 02:30:31 [engine.py:160] ValueError('Sampling parameters are missing for a CompletionRequest.')
ERROR 04-23 02:30:31 [engine.py:160] Traceback (most recent call last):
ERROR 04-23 02:30:31 [engine.py:160] File "/opt/vllm/lib64/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 158, in start
ERROR 04-23 02:30:31 [engine.py:160] self.run_engine_loop()
ERROR 04-23 02:30:31 [engine.py:160] File "/opt/vllm/lib64/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 221, in run_engine_loop
ERROR 04-23 02:30:31 [engine.py:160] request_outputs = self.engine_step()
ERROR 04-23 02:30:31 [engine.py:160] ^^^^^^^^^^^^^^^^^^
ERROR 04-23 02:30:31 [engine.py:160] File "/opt/vllm/lib64/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 247, in engine_step
ERROR 04-23 02:30:31 [engine.py:160] raise e
ERROR 04-23 02:30:31 [engine.py:160] File "/opt/vllm/lib64/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 230, in engine_step
ERROR 04-23 02:30:31 [engine.py:160] return self.engine.step()
ERROR 04-23 02:30:31 [engine.py:160] ^^^^^^^^^^^^^^^^^^
ERROR 04-23 02:30:31 [engine.py:160] File "/opt/vllm/lib64/python3.11/site-packages/vllm/engine/llm_engine.py", line 1493, in step
ERROR 04-23 02:30:31 [engine.py:160] self._process_model_outputs(ctx=ctx)
ERROR 04-23 02:30:31 [engine.py:160] File "/opt/vllm/lib64/python3.11/site-packages/vllm/engine/llm_engine.py", line 1220, in _process_model_outputs
ERROR 04-23 02:30:31 [engine.py:160] request_output = RequestOutputFactory.create(
ERROR 04-23 02:30:31 [engine.py:160] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-23 02:30:31 [engine.py:160] File "/opt/vllm/lib64/python3.11/site-packages/vllm/outputs.py", line 392, in create
ERROR 04-23 02:30:31 [engine.py:160] return RequestOutput.from_seq_group(seq_group, use_cache,
ERROR 04-23 02:30:31 [engine.py:160] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 04-23 02:30:31 [engine.py:160] File "/opt/vllm/lib64/python3.11/site-packages/vllm/outputs.py", line 181, in from_seq_group
ERROR 04-23 02:30:31 [engine.py:160] raise ValueError(
ERROR 04-23 02:30:31 [engine.py:160] ValueError: Sampling parameters are missing for a CompletionRequest.
INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [705]
```
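A minimal sketch of the kind of startup check this change implies, not the
actual patch. The `WarmupShape` container, its field names, and
`validate_warmup_shapes` are hypothetical, and it is an assumption that the
sequence length of a shape is its prompt length plus the number of new tokens:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarmupShape:
    # Hypothetical container; the field names are illustrative only.
    prompt_length: int
    new_tokens: int
    batch_size: int


def validate_warmup_shapes(shapes, max_model_len):
    """Raise ValueError if any warmup shape exceeds max_model_len."""
    for shape in shapes:
        # Assumption: a shape's sequence length is prompt plus new tokens.
        total = shape.prompt_length + shape.new_tokens
        if total > max_model_len:
            raise ValueError(
                f"Warmup shape {shape} needs {total} tokens, which exceeds "
                f"the maximum sequence length of {max_model_len}."
            )
```

Failing at startup turns a misconfiguration into an immediate, readable
error instead of a crash on the first request that falls in the gap.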
---------
Signed-off-by: Max de Bayser <[email protected]>
Co-authored-by: Joe Runde <[email protected]>