Ensure that warmup shapes are available after forking. #47

tdoublep · 2025-03-25T10:30:40Z

I need to do a demo using V0 online serving today, and I noticed that it currently does not work with the latest aiu-vllm-dev image.

Trying to deploy the following:

python3 -m vllm.entrypoints.openai.api_server --model /models/llama-194m/ --max-model-len=2048 --block-size=128

produces:

[SpyreWorker] load model...
ERROR 03-25 09:16:21 [engine.py:448] type object 'SpyrePlatform' has no attribute 'spyre_warmup_shapes'
ERROR 03-25 09:16:21 [engine.py:448] Traceback (most recent call last):
ERROR 03-25 09:16:21 [engine.py:448]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 436, in run_mp_engine
ERROR 03-25 09:16:21 [engine.py:448]     engine = MQLLMEngine.from_vllm_config(
ERROR 03-25 09:16:21 [engine.py:448]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-25 09:16:21 [engine.py:448]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 128, in from_vllm_config
ERROR 03-25 09:16:21 [engine.py:448]     return cls(
ERROR 03-25 09:16:21 [engine.py:448]            ^^^^
ERROR 03-25 09:16:21 [engine.py:448]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 82, in __init__
ERROR 03-25 09:16:21 [engine.py:448]     self.engine = LLMEngine(*args, **kwargs)
ERROR 03-25 09:16:21 [engine.py:448]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-25 09:16:21 [engine.py:448]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/engine/llm_engine.py", line 280, in __init__
ERROR 03-25 09:16:21 [engine.py:448]     self.model_executor = executor_class(vllm_config=vllm_config, )
ERROR 03-25 09:16:21 [engine.py:448]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-25 09:16:21 [engine.py:448]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/executor/executor_base.py", line 52, in __init__
ERROR 03-25 09:16:21 [engine.py:448]     self._init_executor()
ERROR 03-25 09:16:21 [engine.py:448]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/executor/uniproc_executor.py", line 47, in _init_executor
ERROR 03-25 09:16:21 [engine.py:448]     self.collective_rpc("load_model")
ERROR 03-25 09:16:21 [engine.py:448]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
ERROR 03-25 09:16:21 [engine.py:448]     answer = run_method(self.driver_worker, method, args, kwargs)
ERROR 03-25 09:16:21 [engine.py:448]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-25 09:16:21 [engine.py:448]   File "/opt/vllm/lib64/python3.11/site-packages/vllm/utils.py", line 2216, in run_method
ERROR 03-25 09:16:21 [engine.py:448]     return func(*args, **kwargs)
ERROR 03-25 09:16:21 [engine.py:448]            ^^^^^^^^^^^^^^^^^^^^^
ERROR 03-25 09:16:21 [engine.py:448]   File "/opt/vllm/lib64/python3.11/site-packages/vllm_spyre/worker/spyre_worker.py", line 149, in load_model
ERROR 03-25 09:16:21 [engine.py:448]     spyre_warmup_shapes = current_platform.get_warmup_shapes()
ERROR 03-25 09:16:21 [engine.py:448]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-25 09:16:21 [engine.py:448]   File "/opt/vllm/lib64/python3.11/site-packages/vllm_spyre/platform.py", line 167, in get_warmup_shapes
ERROR 03-25 09:16:21 [engine.py:448]     return cls.spyre_warmup_shapes
ERROR 03-25 09:16:21 [engine.py:448]            ^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-25 09:16:21 [engine.py:448] AttributeError: type object 'SpyrePlatform' has no attribute 'spyre_warmup_shapes'

It looks like the engine process is getting forked after we parse the warmup shapes in the main process, and thus in the engine process they don't exist.

It doesn't look like the platform is really the best place to store "state". Why don't we just use the config instead? I understand it is a little bit nasty since we can't "change" that class in the plugin, but it works very nicely.

github-actions · 2025-03-25T10:30:53Z

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes:

pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

Signed-off-by: Thomas Parnell <[email protected]>

maxdebayser · 2025-03-25T12:52:26Z

So the warmup shapes are being parsed in the main server process while the first worker process that also contains the engine in V0 is already up and running?

Signed-off-by: Yannick Schnider <[email protected]>

yannicks1

I just fixed formatting and added the same logic to V1 classes. LGTM!

tjohnson31415 · 2025-03-25T15:40:24Z

> So the warmup shapes are being parsed in the main server process while the first worker process that also contains the engine in V0 is already up and running?

No, the worker process comes up after the main server process parses the warmup shapes, but the shapes aren't re-parsed in the worker. This problem is related to VLLM_WORKER_MULTIPROC_METHOD. For the initial v1 integration @joerunde put in code to force using VLLM_WORKER_MULTIPROC_METHOD=fork:
https://github.com/vllm-project/vllm-spyre/pull/6/files#diff-16ac04c4e75668ccde20cd2cfb82fa496d5d98dbe169ffede15652dc60a16066R54-R62
(this code could be removed now in this PR)

The default of spawn meant that the spyre_warmup_shapes added to the SpyrePlatform class instance did not exist in the spawned worker (the worker process doesn't call set_warmup_shapes). With fork the modifications to the class persist in the worker.

Moving the warmup shapes to the scheduler_config works because the vllm_config is serialized to the worker class during initialization regardless of spawn vs fork.
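The spawn behaviour described above can be sketched with a fresh interpreter started via subprocess, which is roughly what a spawned worker is. This is an illustration only: FakePlatform and the DEMO_WARMUP_PROMPT_LENS variable are hypothetical stand-ins, not the real SpyrePlatform or plugin settings, and it does not claim to reproduce the exact failure in this PR.

```python
import os
import subprocess
import sys


class FakePlatform:
    """Hypothetical stand-in for a platform class (not the real SpyrePlatform)."""


# What a freshly started interpreter sees: the class as written in source,
# without any attributes that the parent attached at runtime.
CHILD_SOURCE = """
import os

class FakePlatform:
    pass

print(hasattr(FakePlatform, "spyre_warmup_shapes"))
print(os.environ.get("DEMO_WARMUP_PROMPT_LENS"))
"""


def run_fresh_interpreter():
    # Runtime mutation of the class: only visible in this process.
    FakePlatform.spyre_warmup_shapes = [{"prompt_length": 64}]
    # Environment variables are inherited by the child process.
    env = dict(os.environ, DEMO_WARMUP_PROMPT_LENS="64")
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    has_attr, prompt_lens = result.stdout.split()
    return has_attr == "True", prompt_lens
```

In this sketch the child does not see the class attribute set by the parent, but it does see the environment variable. State that is passed explicitly (a serialized config) or re-derived (from the environment) survives; state attached to a class at runtime does not.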

tdoublep · 2025-03-25T15:53:45Z

@tjohnson31415 thank you for the correction + detailed explanation! I wrongly assumed that the default V0 behaviour was fork too. I will try it quickly now on main using fork.

Either way, I still think we should consider moving the shapes into the config. In fact, I wonder why we are really using env variables at all. Wouldn't we rather be passing them as proper arguments? If vLLM does not provide a way for plugins to add their own custom args/config then perhaps this is something we could change upstream?

tdoublep · 2025-03-25T16:49:57Z

Hmm, I get the same error even if I set:

export VLLM_WORKER_MULTIPROC_METHOD=fork

tjohnson31415 · 2025-03-25T17:38:23Z

Hmm, yeah, then this may be something else, not the serialization problem that I'm familiar with 🤔

joerunde · 2025-03-25T20:50:39Z

> If vLLM does not provide a way for plugins to add their own custom args/config then perhaps this is something we could change upstream?

+1 on bringing that up upstream. I'd rather not have to do the entrypoint hijacking that the vllm tgis adapter does to add cli args.

For this PR though, do we need to store the warmup shapes on a config object at all? It seems like they can be rebuilt in each worker from the environment variables, and I'd prefer doing that now for robustness until we work out some upstream changes for plugins to be able to add their own arguments and config properly.
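A minimal sketch of rebuilding the shapes from the environment on every call, so that each worker can derive them itself regardless of how it was started. The environment variable names, the dictionary keys, and the helper name warmup_shapes_from_env are assumptions for illustration and should be checked against the plugin; this is not the code that was merged.

```python
import os
from typing import Dict, List, Mapping, Optional

# Assumed variable names, used here for illustration only.
PROMPT_LENS = "VLLM_SPYRE_WARMUP_PROMPT_LENS"
NEW_TOKENS = "VLLM_SPYRE_WARMUP_NEW_TOKENS"
BATCH_SIZES = "VLLM_SPYRE_WARMUP_BATCH_SIZES"


def _int_list(env: Mapping[str, str], name: str) -> List[int]:
    """Parse a comma-separated list of integers; unset or empty gives []."""
    raw = env.get(name, "")
    return [int(item) for item in raw.split(",") if item.strip()]


def warmup_shapes_from_env(
    env: Optional[Mapping[str, str]] = None,
) -> List[Dict[str, int]]:
    """Build the warmup shapes from environment variables.

    Nothing is cached on a class or module, so the result does not depend
    on state that was set up in another process.
    """
    env = os.environ if env is None else env
    prompt_lens = _int_list(env, PROMPT_LENS)
    new_tokens = _int_list(env, NEW_TOKENS)
    batch_sizes = _int_list(env, BATCH_SIZES)

    if not (len(prompt_lens) == len(new_tokens) == len(batch_sizes)):
        raise ValueError(
            "warmup prompt lengths, new tokens and batch sizes "
            "must have the same number of entries"
        )

    return [
        {"prompt_length": p, "new_tokens": n, "batch_size": b}
        for p, n, b in zip(prompt_lens, new_tokens, batch_sizes)
    ]
```

Passing a plain mapping instead of os.environ keeps the helper easy to test without touching the real environment.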

joerunde · 2025-03-25T20:57:40Z

Also.... we should probably have at least a single test that runs the openai server so we can catch these problems too!

tdoublep · 2025-03-26T09:00:20Z

> Also.... we should probably have at least a single test that runs the openai server so we can catch these problems too!

Agreed. @dpatel-ops and I had discussed this before.

tdoublep · 2025-03-26T09:02:09Z

> For this PR though, do we need to store the warmup shapes on a config object at all? It seems like they can be rebuilt in each worker from the environment variables, and I'd prefer doing that now for robustness until we work out some upstream changes for plugins to be able to add their own arguments and config properly.

@joerunde We can do it like that for now, yeah. I had an earlier attempt where I was doing that; I just felt that storing them in the config was cleaner since they are kept in "one place". It is not a huge difference though tbh.

joerunde · 2025-03-26T15:17:06Z

Added a followup issue to correctly handle cli args and configs here: #51

Signed-off-by: Joe Runde <[email protected]>

tdoublep

LGTM - thanks!

### [v0] replace current_platform with SpyrePlatform

PR #47 missed replacing current_platform with SpyrePlatform in the v0 model runner. I don't think this is an issue or related to the recent failures of v0 static batching on AIU Spyre, just add it here for completeness.

Signed-off-by: Yannick Schnider <[email protected]>

tdoublep requested review from joerunde, sducouedic and yannicks1 March 25, 2025 10:32

Store warmup shapes in config

5b95b15

Signed-off-by: Thomas Parnell <[email protected]>

tdoublep force-pushed the tpa-fix-v0-warmup branch from 9973a7c to 5b95b15 Compare March 25, 2025 11:10

yannicks1 added 2 commits March 25, 2025 14:39

fix formatting

230efde

Signed-off-by: Yannick Schnider <[email protected]>

apply fix to V1 classes

0ff75fc

Signed-off-by: Yannick Schnider <[email protected]>

yannicks1 approved these changes Mar 25, 2025

yannicks1 mentioned this pull request Mar 26, 2025

[tests] Add online test #50

Closed

joerunde mentioned this pull request Mar 26, 2025

Add CLI args and config #51

Closed

joerunde added 4 commits March 26, 2025 09:44

✨ always get warmup shapes from env

2520036

Signed-off-by: Joe Runde <[email protected]>

Merge branch 'main' into tpa-fix-v0-warmup

04f032f

🐛 fixup minor merge issues

be25359

Signed-off-by: Joe Runde <[email protected]>

🔥 remove forking requirement

95a1c9e

Signed-off-by: Joe Runde <[email protected]>

tdoublep commented Mar 26, 2025

joerunde merged commit 15c0e52 into main Mar 26, 2025
9 checks passed

joerunde deleted the tpa-fix-v0-warmup branch March 26, 2025 18:07

yannicks1 mentioned this pull request Jun 25, 2025

[v0] replace current_platform with SpyrePlatform #263

Merged
