We found a regression during the development of #54: an assert in spyre_model_runner.py fails, probably due to a behavior change in the vLLM engine's scheduler.

Note: #54 is required to reproduce the issue, due to broken imports caused by a recent refactoring of vLLM.
Reproduced on:

```
commit 27df5199d99627e1eb101071c2155f888181bd64 (HEAD -> main, origin/main, origin/HEAD)
```
This simple offline script consistently reproduces the issue:
```python
from vllm import LLM, SamplingParams

# Define prompts and their corresponding sampling parameters
prompts = [
    "Hello, my name is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params_list = [
    SamplingParams(seed=123, temperature=0.8, top_p=0.95),
    SamplingParams(seed=123, temperature=0.5, top_p=0.9),
    SamplingParams(seed=123, temperature=0.7, top_p=0.85),
]

model = "/models/llama-194m/"
llm = LLM(model=model, enforce_eager=False)

# Generate texts for each prompt with its sampling parameters
outputs = llm.generate(prompts, sampling_params_list)

# Print the outputs
for response in outputs:
    print(f"Prompt: {response.prompt!r}, Generated text: {response.outputs[0].text!r}")
```
Outputs (the generations complete before the engine crashes):

```
Prompt: 'Hello, my name is', Generated text: " 5c1. I'm a teacher, and you teach me how to"
Prompt: 'The capital of France is', Generated text: ' Paris. It is located in the center of France and is the largest city in'
Prompt: 'The future of AI is', Generated text: ' in the hands of machine learning, which is the process of a computer learning to'
```
```
ERROR 03-26 20:20:17 [core.py:344] EngineCore hit an exception: Traceback (most recent call last):
ERROR 03-26 20:20:17 [core.py:344]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/v1/engine/core.py", line 337, in run_engine_core
ERROR 03-26 20:20:17 [core.py:344]     engine_core.run_busy_loop()
ERROR 03-26 20:20:17 [core.py:344]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/v1/engine/core.py", line 371, in run_busy_loop
ERROR 03-26 20:20:17 [core.py:344]     outputs = step_fn()
ERROR 03-26 20:20:17 [core.py:344]               ^^^^^^^^^
ERROR 03-26 20:20:17 [core.py:344]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/v1/engine/core.py", line 196, in step
ERROR 03-26 20:20:17 [core.py:344]     output = self.model_executor.execute_model(scheduler_output)
ERROR 03-26 20:20:17 [core.py:344]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-26 20:20:17 [core.py:344]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/v1/executor/abstract.py", line 77, in execute_model
ERROR 03-26 20:20:17 [core.py:344]     output = self.collective_rpc("execute_model",
ERROR 03-26 20:20:17 [core.py:344]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-26 20:20:17 [core.py:344]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
ERROR 03-26 20:20:17 [core.py:344]     answer = run_method(self.driver_worker, method, args, kwargs)
ERROR 03-26 20:20:17 [core.py:344]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-26 20:20:17 [core.py:344]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/utils.py", line 2257, in run_method
ERROR 03-26 20:20:17 [core.py:344]     return func(*args, **kwargs)
ERROR 03-26 20:20:17 [core.py:344]            ^^^^^^^^^^^^^^^^^^^^^
ERROR 03-26 20:20:17 [core.py:344]   File "/usr/local/lib64/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 03-26 20:20:17 [core.py:344]     return func(*args, **kwargs)
ERROR 03-26 20:20:17 [core.py:344]            ^^^^^^^^^^^^^^^^^^^^^
ERROR 03-26 20:20:17 [core.py:344]   File "/opt/vllm/lib64/python3.11/site-packages/vllm_spyre/v1/worker/spyre_worker.py", line 370, in execute_model
ERROR 03-26 20:20:17 [core.py:344]     output = self.model_runner.execute_model(scheduler_output)
ERROR 03-26 20:20:17 [core.py:344]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-26 20:20:17 [core.py:344]   File "/usr/local/lib64/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 03-26 20:20:17 [core.py:344]     return func(*args, **kwargs)
ERROR 03-26 20:20:17 [core.py:344]            ^^^^^^^^^^^^^^^^^^^^^
ERROR 03-26 20:20:17 [core.py:344]   File "/opt/vllm/lib64/python3.11/site-packages/vllm_spyre/v1/worker/spyre_model_runner.py", line 304, in execute_model
ERROR 03-26 20:20:17 [core.py:344]     model_input = self.prepare_model_input(scheduler_output)
ERROR 03-26 20:20:17 [core.py:344]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-26 20:20:17 [core.py:344]   File "/opt/vllm/lib64/python3.11/site-packages/vllm_spyre/v1/worker/spyre_model_runner.py", line 283, in prepare_model_input
ERROR 03-26 20:20:17 [core.py:344]     self._prepare_decode(scheduler_output.scheduled_cached_reqs)
ERROR 03-26 20:20:17 [core.py:344]   File "/opt/vllm/lib64/python3.11/site-packages/vllm_spyre/v1/worker/spyre_model_runner.py", line 203, in _prepare_decode
ERROR 03-26 20:20:17 [core.py:344]     assert len(cached_requests) > 0
ERROR 03-26 20:20:17 [core.py:344]            ^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-26 20:20:17 [core.py:344] AssertionError
ERROR 03-26 20:20:17 [core.py:344]
CRITICAL 03-26 20:20:17 [core_client.py:269] Got fatal signal from worker processes, shutting down. See stack trace above for root cause issue.
```
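For context, the assertion fires because `_prepare_decode` assumes the scheduler always hands it a non-empty `scheduled_cached_reqs` list, and the new scheduler behavior apparently violates that. Below is a minimal, self-contained sketch of the failure mode and one possible defensive guard; all names (`SchedulerOutput`, `prepare_decode`, `prepare_model_input`) are simplified stand-ins for illustration, not the actual vllm-spyre implementation:

```python
from dataclasses import dataclass, field

@dataclass
class SchedulerOutput:
    # Requests already in the cache that should run a decode step.
    scheduled_cached_reqs: list = field(default_factory=list)

def prepare_decode(cached_requests):
    # Mirrors the failing assumption in _prepare_decode: the list
    # is expected to be non-empty when the decode path is taken.
    assert len(cached_requests) > 0
    return [f"decode:{req}" for req in cached_requests]

def prepare_model_input(out: SchedulerOutput):
    # Guarded variant: skip the decode path entirely when the
    # scheduler emits a step with no cached requests, which is
    # the situation that trips the assert in the traceback above.
    if not out.scheduled_cached_reqs:
        return None  # nothing to decode in this step
    return prepare_decode(out.scheduled_cached_reqs)

# A step with an empty cached-request list no longer raises.
print(prepare_model_input(SchedulerOutput()))            # None
print(prepare_model_input(SchedulerOutput(["req-0"])))   # ['decode:req-0']
```

Whether skipping the step is correct, or the scheduler should never emit such a step in the first place, is exactly the question this regression raises.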