-
Notifications
You must be signed in to change notification settings - Fork 26
Ensure that warmup shapes are available after forking. #47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Hi! Thank you for contributing to vLLM support on Spyre. Now you are good to go 🚀 |
Signed-off-by: Thomas Parnell <[email protected]>
9973a7c to
5b95b15
Compare
|
So the warmup shapes are being parsed in the main server process while the first worker process that also contains the engine in V0 is already up and running? |
Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Yannick Schnider <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just fixed formatting and added the same logic to V1 classes. LGTM!
No, the worker process comes up after the main server process parses the warmup shapes, but the shapes aren't re-parsed in the worker. This problem is related to The default of Moving the warmup shapes to the |
|
@tjohnson31415 thank you for the correction + detailed explanation! I wrongly assumed that the default V0 behaviour was fork too. I will try it quickly now on main using fork. Either way, I still think we should consider moving the shapes into the config. In fact, I wonder why we are really using env variables at all. Wouldn't we rather be passing them as proper arguments? If vLLM does not provide a way for plugins to add their own custom args/config then perhaps this is something we could change upstream? |
|
Hmm, I get the same error even if I set: |
|
Hmm, yeah, then this may be something else, not the serialization problem that I'm familiar with 🤔 |
+1 on bringing that up upstream. I'd rather not have to do the entrypoint hijacking that the vllm tgis adapter does to add cli args. For this PR though, do we need to store the warmup shapes on a config object at all? It seems like they can be rebuilt in each worker from the environment variables, and I'd prefer doing that now for robustness until we work out some upstream changes for plugins to be able to add their own arguments and config properly. |
|
Also.... we should probably have at least a single test that runs the openai server so we can catch these problems too! |
Agreed. @dpatel-ops and I had discussed this before. |
@joerunde We can do it like that for now, yeah. I had an earlier attempt where I was doing that, just felt that cleaning in the config was cleaner since they are kept in "one place". It is not a huge difference though tbh. |
|
Added a followup issue to correctly handle cli args and configs here: #51 |
Signed-off-by: Joe Runde <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - thanks!
### [v0] replace current_platform with SpyrePlatform PR #47 missed replacing current_platform with SpyrePlatform in the v0 model runner. I don't think this is an issue or related to the recent failures of v0 static batching on AIU Spyre, just add it here for completeness. Signed-off-by: Yannick Schnider <[email protected]>
I need to do a demo using V0 online serving today, and I noticed that it is currently not working using latest
aiu-vllm-devimage:Trying to deploy the following:
produces:
It looks like the engine process is getting forked after we parse the warmup shapes in the main process, and thus in the engine process they don't exist.
It doesn't look like the platform is really the best place to store "state". Why don't we just use the config instead? I understand it is a little bit nasty since we can't "change" that class in the plugin, but it works very nicely.