yannicks1
Collaborator

update openai_spyre_inference.py

changes:

  • use model 'ibm-granite/granite-3.3-8b-instruct'
  • use tensor parallelism of 4 (tp 4)
  • remove the block-size argument, since it gets overridden anyway (a value of 2048 actually throws an error because it is not supported upstream)
  • expand (and explain) the server startup instructions for continuous batching
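
For readers who have not opened the example, here is a rough sketch of the kind of request an online inference client could send to an OpenAI-compatible vLLM server. It is not the content of openai_spyre_inference.py. The helper name build_completion_request, the prompt, the max_tokens value, and the base URL (vLLM's default port 8000 on localhost) are assumptions for illustration; only the model name comes from this PR.

```python
import json
import urllib.request

# Model named in this PR's change list.
MODEL = "ibm-granite/granite-3.3-8b-instruct"


def build_completion_request(prompt, base_url="http://localhost:8000", max_tokens=20):
    """Build (but do not send) a POST request for the /v1/completions endpoint.

    Hypothetical helper: base_url and max_tokens are illustrative defaults.
    """
    payload = {"model": MODEL, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_completion_request("Hello, my name is")
    # Sending the request needs a running server, so it is left commented out:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response)["choices"][0]["text"])
    print(request.full_url)
```

The request is only constructed here, so the sketch can be inspected without a Spyre-backed server running.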

@yannicks1 yannicks1 marked this pull request as ready for review October 10, 2025 15:12

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

Signed-off-by: Yannick Schnider <[email protected]>
Continuous Batching:
First, start the server with the following command:
VLLM_SPYRE_USE_CB=1 python3 -m vllm.entrypoints.openai.api_server \
Collaborator

Should we use vllm serve here instead? Or python3 -m vllm.entrypoints.cli.main serve?

Collaborator Author

done
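
As a hedged illustration of the launch form discussed in this thread, the sketch below assembles a vllm serve invocation from the settings listed in the PR description (model, tp 4, continuous batching via VLLM_SPYRE_USE_CB=1). The helper names build_serve_command and build_env are hypothetical. The merged startup instructions may include further arguments that are not visible in this conversation, so treat this as a partial command rather than the exact one from the example.

```python
import os
import shlex

# Model named in this PR's change list.
MODEL = "ibm-granite/granite-3.3-8b-instruct"


def build_serve_command(model=MODEL, tensor_parallel_size=4):
    """Return the argv list for a `vllm serve` launch (hypothetical helper)."""
    return ["vllm", "serve", model, "--tensor-parallel-size", str(tensor_parallel_size)]


def build_env(base_env=None):
    """Copy the environment and enable continuous batching on Spyre."""
    env = dict(os.environ if base_env is None else base_env)
    env["VLLM_SPYRE_USE_CB"] = "1"
    return env


if __name__ == "__main__":
    # Print the equivalent shell line instead of starting a server.
    print("VLLM_SPYRE_USE_CB=1 " + shlex.join(build_serve_command()))
```

Passing the list and environment to subprocess.Popen(build_serve_command(), env=build_env()) would start the server on a machine where vLLM with Spyre support is installed.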

Collaborator

@maxdebayser maxdebayser left a comment

Nice!

Signed-off-by: Yannick Schnider <[email protected]>
@yannicks1 yannicks1 merged commit beae866 into main Oct 13, 2025
18 checks passed
@yannicks1 yannicks1 deleted the ysc-update-online-example branch October 13, 2025 09:41

3 participants