
Conversation

@gkumbhat (Collaborator)

Description

  • Replace the FP8 model-name check with a quantization check. A user could rename their model, which would leave things in an inconsistent state, and if they are trying to use a pre-compiled model it would trigger recompilation. (A sketch of the idea follows below.)
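
A minimal sketch of the kind of change described, assuming vLLM's model config exposes the quantization method as model_config.quantization; the exact attribute and surrounding code in vllm-spyre may differ:

# Before: brittle check on the model name, which a user can change
is_fp8 = 'FP8' in self.model_config.model

# After (assumed attribute): check the quantization method reported by the
# model config, which does not depend on what the model is called
is_fp8 = self.model_config.quantization == "fp8"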

Related Issues

@github-actions

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@gkumbhat marked this pull request as ready for review on October 16, 2025 at 21:33
@prashantgupta24 (Collaborator) left a comment:

lgtm


Inline review comment on this part of the diff (excerpt, truncated as shown on the page):

# TODO: we need 2 requests for warmup on FP8+CB
is_fp8_plus_cb = 'FP8' in self.model_config.model and \
# Check if model is quantized
Collaborator:

What about other quantizations like 4 bit or 8 bit int?

There is some more code from @joerunde and some I recently wrote to do this kind of check, which could be adapted and used here:

Collaborator:

We'll have to refactor all these checks once we support more quantization methods; today we only support fp8. I wouldn't mind a refactor to at least pull all of these fp8 checks into one helper instance method in the model class, but we don't need to block this fix on it.

I do think this is a separate problem from figuring out which model we're serving, though, because any fp8 model has to be handled separately here, not just granite specifically.
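
A rough sketch of the helper-method idea, purely illustrative: the attribute name model_config.quantization and the standalone-function shape are assumptions, not taken from the actual diff.

def is_fp8_quantized(model_config) -> bool:
    # Hypothetical helper consolidating the FP8 checks; in the real code
    # this would likely be an instance method on the model/runner class,
    # as suggested in review.
    quant = getattr(model_config, "quantization", None)
    return quant is not None and str(quant).lower() == "fp8"

A call site such as the FP8+CB warmup check could then derive its condition from this helper instead of repeating the string comparison, keeping the quantization logic in one place.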

Collaborator:

Created a follow-up issue: #537

@joerunde (Collaborator) left a comment:

Lgtm too

@joerunde (Collaborator):

bot:test
MARKERS="spyre and quantized and not multi"

@joerunde (Collaborator):

I don't believe in the bot tests any more, but we might as well try to kick one off to make sure nothing barfs in a new and unusual way.

@joerunde merged commit 3e35a3a into vllm-project:main on Oct 17, 2025
21 checks passed