ci: Add vLLM support to integration testing infrastructure (with qwen) #3545
base: main
Conversation
Force-pushed from e07b4bb to 746e9c9
 },
 defaults={
-    "text_model": "vllm/meta-llama/Llama-3.2-1B-Instruct",
+    "text_model": "vllm/Qwen/Qwen3-0.6B",
@derekhiggins anything blocking with this PR?
I need to fix a problem in the record mechanism that arose over the last few days. Hopefully I'll have it working again later today.
Force-pushed from 746e9c9 to 5468bab
Force-pushed from 88e722c to 1e995fe
@@ -168,6 +168,11 @@ class Setup(BaseModel):
     roots=base_roots,
     default_setup="ollama",
 ),
+"base-vllm-subset": Suite(
is this needed anymore?
My intent here was to add this job with only the tests in "tests/integration/inference", and then, once we're happy we haven't caused any major disruption, we could expand to the entire suite.
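A rough sketch of what such a restricted suite entry could look like (the field values below are assumptions based on the diff above, not the actual configuration):

"base-vllm-subset": Suite(
    # Assumption: limit the suite to the inference tests only, so an initial
    # vLLM CI job cannot disrupt the rest of the integration matrix.
    roots=["tests/integration/inference"],
    default_setup="vllm",
),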
@derekhiggins you feel like this is ready?
Force-pushed from feb3ec9 to 1c3ea50
Signed-off-by: Derek Higgins <[email protected]>
It performs better in tool calling and structured output tests. Signed-off-by: Derek Higgins <[email protected]>
Add vLLM provider support to integration test CI workflows alongside the existing Ollama support. Configure provider-specific test execution: vLLM runs only inference-specific tests (excluding vision tests), while Ollama continues to run the full test suite. This enables CI testing of both inference providers while keeping the vLLM footprint small; it can be expanded later if it proves not to be too disruptive. Also updated test skips that were marked with "inline::vllm", which should be "remote::vllm". This causes some failing log probs tests to be skipped and should be revisited. Signed-off-by: Derek Higgins <[email protected]>
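As an illustration of the skip-marker fix described above (the helper below is hypothetical; only the "remote::vllm" provider string comes from the commit message):

import pytest

def skip_if_remote_vllm(provider_type: str) -> None:
    # The remote vLLM adapter reports "remote::vllm", not "inline::vllm",
    # so a skip keyed on the wrong string would never trigger.
    if provider_type == "remote::vllm":
        pytest.skip("log probs not supported by the remote vLLM provider yet")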
Force-pushed from 1c3ea50 to 0f0d986
o Introduces vLLM provider support to the record/replay testing framework.
o Enables both recording and replay of vLLM API interactions alongside the existing Ollama support.
The changes enable testing of vLLM functionality. vLLM tests focus on inference capabilities, while Ollama continues to exercise the full API surface, including vision features.
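A minimal sketch of the record/replay idea, for reference (illustrative only; it assumes an OpenAI-compatible client object and a TEST_MODE environment variable, and does not reproduce the project's actual recording machinery):

import json
import os
import pathlib

def cached_completion(client, recording_path: str, **request) -> dict:
    # In replay mode, return the stored response if one exists; otherwise make
    # a live call to the provider (vLLM or Ollama) and record the result.
    path = pathlib.Path(recording_path)
    if os.environ.get("TEST_MODE", "replay") == "replay" and path.exists():
        return json.loads(path.read_text())
    response = client.completions.create(**request)
    payload = response.model_dump()
    path.write_text(json.dumps(payload))
    return payload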
--
This is an alternative to #3128: Qwen3 appears to be more capable at structured output and tool calls than Llama 3.2 1B.