
Conversation

ckadner (Collaborator) commented Sep 5, 2025

Description

  • Add config file for models and runtime parameters
  • Use configuration file in documentation
  • Add validation code to compare requested model and runtime parameters with supported configurations
  • Log warning when requested configuration is not supported (a sketch of this flow follows the example below)
    Example: WARNING 09-05 18:46:43 [runtime_config_validator.py:107] The requested configuration is not supported for model 'ibm-ai-platform/micro-g3.3-8b-instruct-1b': RuntimeConfiguration(platform=, cb=True, tp_size=1, max_model_len=128, max_num_seqs=2, num_blocks=0, warmup_shapes=None)
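
For illustration, the validation flow described above might look roughly like the following sketch; the field names follow the RuntimeConfiguration shown in the example log line, while the types, defaults, and matching logic are assumptions rather than the actual code in runtime_config_validator.py.

import logging
from dataclasses import dataclass

logger = logging.getLogger("runtime_config_validator")

@dataclass(frozen=True)
class RuntimeConfiguration:
    platform: str = ""
    cb: bool = False
    tp_size: int = 1
    max_model_len: int = 0
    max_num_seqs: int = 0
    num_blocks: int = 0
    warmup_shapes: tuple | None = None

def validate_runtime_config(model: str, requested: RuntimeConfiguration,
                            supported: dict[str, list[RuntimeConfiguration]]) -> bool:
    # Return True if the requested configuration matches one of the supported
    # configurations for this model; otherwise log a warning and return False.
    if requested in supported.get(model, []):
        return True
    logger.warning("The requested configuration is not supported for model %r: %s",
                   model, requested)
    return False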

TODO:

  • code cleanup
  • review/revise model-config YAML file structure
  • add a YAML field to ignore testing models/configurations for tiny model unit tests
  • what to use for num_blocks (cpu, gpu ...override)?
  • revise config validation logic and messaging
    • 2-stage config matching ... top-level fields first, set containment for warmup_shapes second (see the sketch after this list)
  • update configs after release (candidate) testing
  • remove option to error out on unknown configuration
  • match models by config if they are mounted locally
  • integrate model/runtime configurations into tests (⚗️ draft supported model tests #435)
  • get_warmup_shapes_from_envs() does not yield the same result as platform.py:cls._warmup_shapes
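
A rough illustration of the two-stage matching idea from the TODO item above, assuming configurations are plain dicts and warmup shapes are hashable tuples; the function name and structure are hypothetical:

def config_matches(requested: dict, supported: dict) -> bool:
    # Stage 1: all top-level fields (everything except warmup_shapes) must match.
    top_level = {k: v for k, v in supported.items() if k != "warmup_shapes"}
    if any(requested.get(k) != v for k, v in top_level.items()):
        return False
    # Stage 2: the requested warmup shapes must be a subset of the supported shapes.
    requested_shapes = set(requested.get("warmup_shapes") or [])
    supported_shapes = set(supported.get("warmup_shapes") or [])
    return requested_shapes <= supported_shapes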

Review suggestions:

I wonder if it's feasible to test the warmup shapes like this. Maybe we could do something like:

  • in the known configuration file, [only keep the] upper bound
  • Validate that the prompts are multiples of 64
  • Validate that prompt + new_tokens <= max_model_len
  • Validate that the batch size is <= a tested upper bound.
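
A rough sketch of these suggested checks, assuming a warmup shape of (prompt_length, new_tokens, batch_size) and illustrative bound names; not the actual validator:

def check_warmup_shape(prompt_length: int, new_tokens: int, batch_size: int,
                       max_model_len: int, max_prompt_length: int,
                       max_batch_size: int) -> list[str]:
    # Collect human-readable reasons why the requested shape falls outside
    # the tested bounds; an empty list means the shape is acceptable.
    errors = []
    if prompt_length % 64 != 0:
        errors.append(f"prompt length {prompt_length} is not a multiple of 64")
    if prompt_length > max_prompt_length:
        errors.append(f"prompt length {prompt_length} exceeds the tested upper bound {max_prompt_length}")
    if prompt_length + new_tokens > max_model_len:
        errors.append(f"prompt + new_tokens ({prompt_length + new_tokens}) exceeds max_model_len {max_model_len}")
    if batch_size > max_batch_size:
        errors.append(f"batch size {batch_size} exceeds the tested upper bound {max_batch_size}")
    return errors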

Related Issues

#435

ckadner marked this pull request as draft September 5, 2025 19:01
vllm-project deleted a comment from github-actions bot Sep 5, 2025
ckadner marked this pull request as draft September 30, 2025 19:48
ckadner marked this pull request as ready for review September 30, 2025 21:26
ckadner changed the title from "WIP: Manage supported model configurations" to "Manage supported model configurations" Sep 30, 2025
ckadner (Collaborator, Author) commented Oct 6, 2025

Hi @maxdebayser, I added validation code and unit tests (sketched after the list below) for:

  • in the known configuration file, [only keep the] upper bound
  • test [requested warmup_shapes against] upper bound for prompt length, batch size and max_new_tokens
  • sum of prompt + max_new_tokens is smaller than the max_model_len [of supported configs]
  • prompt size is a multiple of 64
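
(For illustration, such unit tests might look roughly like the following hypothetical pytest sketch; it exercises the illustrative check_warmup_shape from the earlier sketch, imported from a made-up module, with made-up bounds.)

import pytest

from runtime_config_checks import check_warmup_shape  # hypothetical module holding the earlier sketch

@pytest.mark.parametrize("prompt_length, new_tokens, batch_size, ok", [
    (64, 20, 1, True),    # within all bounds, prompt is a multiple of 64
    (65, 20, 1, False),   # prompt length is not a multiple of 64
    (128, 10, 1, False),  # prompt exceeds the tested prompt bound of 64
    (64, 20, 8, False),   # batch size exceeds the tested upper bound of 4
])
def test_warmup_shape_validation(prompt_length, new_tokens, batch_size, ok):
    errors = check_warmup_shape(prompt_length, new_tokens, batch_size,
                                max_model_len=128, max_prompt_length=64,
                                max_batch_size=4)
    assert (not errors) == ok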

Kindly take another look? Thank you! 🙏🏻

maxdebayser (Collaborator) commented:

@ckadner, I was having trouble explaining my thoughts as review comments, so I put them in code form: ckadner#19.

maxdebayser (Collaborator) commented:

@ckadner, my assumptions aren't correct. Please disregard some of my previous comments about upper bounds.

maxdebayser (Collaborator) left a comment:

I've left a small suggestion, but otherwise it LGTM

ckadner (Collaborator, Author) commented Oct 8, 2025

One last item to do:


matching_models = [
    model for model, config in (known_model_configs or {}).items()
    if config.items() <= requested_config.items()
]
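
(This works because dict items views support set-style comparison: config.items() <= requested_config.items() holds exactly when every key/value pair of the known config also appears in the requested configuration.)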
A collaborator replied:

nice!

maxdebayser (Collaborator) left a comment:

The verification part of the code looks good to me and I like some of the clever tricks used for matching configurations. I have just one question: what is the motivation for maintaining the pre-downloaded config.json files? Is it to avoid downloads during testing or perhaps to make sure that the configurations are not updated remotely?

Because if you do something like

from transformers import AutoConfig
c = AutoConfig.from_pretrained("peiyi9979/math-shepherd-mistral-7b-prm")

the config.json will be downloaded automatically and cached in the local Hugging Face cache, and it will also use the config.json of models that have already been downloaded.

ckadner (Collaborator, Author) commented Oct 14, 2025

[...] question: what is the motivation for maintaining the pre-downloaded config.json files? Is it to avoid downloads during testing or perhaps to make sure that the configurations are not updated remotely?

I wanted to have one unit test which verifies that we can consistently match models by their config. So, in that unit test, I need to instantiate a model config for each of our supported models and then use those to verify that my runtime configuration validation code can reliably "guess" the correct model, which is needed to actually verify the runtime configurations.

Since we run our unit tests in "offline" mode in our GitHub Actions tests, we need to have those configs available before test execution.

We could add a pre-step in our GHA test workflow to download those configs (from HF or GHA cache) instead of keeping them in our unit tests folder. But @joerunde had already set that precedent of keeping 2 config.json files, and I like following precedent :-)
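
(For illustration, matching a locally available model by its config.json could look roughly like this; the helper name, the field-containment check, and the known_configs structure are assumptions, while PretrainedConfig.from_json_file reads the file without any network access.)

from pathlib import Path
from transformers import PretrainedConfig

def guess_model_from_config(config_path: Path, known_configs: dict[str, dict]) -> str | None:
    # Load the local config.json (no download) and return the first known model
    # whose stored config fields all match; None if there is no match.
    config = PretrainedConfig.from_json_file(str(config_path)).to_dict()
    for model_name, known in known_configs.items():
        if all(config.get(k) == v for k, v in known.items()):
            return model_name
    return None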

maxdebayser (Collaborator) left a comment:

Thanks, @ckadner, that makes sense to me.

ckadner merged commit f6a83ce into vllm-project:main on Oct 15, 2025 (19 checks passed)