Add model registration and qualification handoff by Thump604 · Pull Request #419 · waybarrios/vllm-mlx

Thump604 · 2026-04-24T15:21:53Z

Stacked on #417. This keeps the first model artifact workflow PR reviewable, then adds the next handoff layer without mutating any local Ops registry.

Scope:

- add vllm-mlx model register <artifact> to write a portable vllm_mlx_registration_manifest.json
- include artifact inspection, acquisition/conversion source manifests, parser policy, serving defaults, feature flags, preset alias, and explicit production_ready=false
- add vllm-mlx model qualify <model-id> to create or run a bench-serve qualification handoff command and optional request manifest
- keep qualification as a handoff artifact, not a green-light. Production readiness still requires the standing workload contract and review of the resulting evidence
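
As a rough illustration of the registration manifest described in the scope list, the sketch below writes a JSON file carrying a few of the listed categories. Only the filename and the production_ready=false value come from this PR. Every key name and both helper functions are hypothetical and do not reflect the actual schema in vllm_mlx/model_workflow.py.

```python
import json
from pathlib import Path

# Filename taken from the PR description.
MANIFEST_NAME = "vllm_mlx_registration_manifest.json"


def build_registration_manifest(
    model_id, served_model_name, preset_alias, serving_defaults, feature_flags
):
    """Hypothetical helper: assemble a portable manifest as a plain dict.

    The key names are assumptions made for illustration only.
    """
    return {
        "model_id": model_id,
        "served_model_name": served_model_name,
        "preset_alias": preset_alias,
        "serving_defaults": dict(serving_defaults),
        "feature_flags": sorted(feature_flags),
        # Registration never marks a model as production ready.
        "production_ready": False,
    }


def write_manifest(artifact_dir, manifest):
    """Hypothetical helper: write the manifest next to the artifact."""
    path = Path(artifact_dir) / MANIFEST_NAME
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True) + "\n")
    return path
```

Writing into the artifact directory rather than a shared registry mirrors the intent stated above: the manifest stays portable and nothing local is mutated.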

Local validation:

uv run --extra dev pytest -q tests/test_model_workflow.py tests/test_download.py
# 23 passed

uv run --extra dev black --check vllm_mlx/model_workflow.py vllm_mlx/cli.py tests/test_model_workflow.py
uvx ruff check vllm_mlx/model_workflow.py vllm_mlx/cli.py tests/test_model_workflow.py --select E,F,W --ignore E402,E501,E731,F811,F841
git diff --check
# clean

CLI smokes:

vllm-mlx model qualify qwen-test --workload /tmp/workload.json --dry-run --extra-arg=--tag --extra-arg smoke
vllm-mlx model register <tmp-artifact> --model-id qwen-test --served-model-name qwen-test --preset-alias fast-qwen --mllm --default-temperature 0.6 --default-top-p 0.95 --default-top-k 20 --default-min-p 0.0 --default-presence-penalty 0.0 --default-repetition-penalty 1.0 --default-chat-template-kwargs '{"enable_thinking":true}' --feature-flag prefix_cache

Dependency note: model qualify --workload is intended to compose with #406's workload contract support. Without that bench-serve support, the dry-run manifest is still useful, but running the generated command with --workload depends on #406 or equivalent landing.
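
The dry-run behaviour in the dependency note can be pictured with a small sketch: build the bench-serve command, record it in a handoff dict, and only execute it when not in dry-run mode. This is an assumption-heavy illustration, not the code in this PR. The `--model` flag, the dict keys, and both function names are hypothetical; only `--workload`, the dry-run idea, and the pass-through extra arguments come from the description above.

```python
import shlex
import subprocess


def build_qualify_command(model_id, workload=None, extra_args=()):
    """Hypothetical helper: compose a bench-serve command as an argv list.

    The real bench-serve flags may differ; "--model" is an assumption.
    """
    cmd = ["vllm-mlx", "bench-serve", "--model", model_id]
    if workload is not None:
        # Running with --workload depends on #406 or equivalent landing.
        cmd += ["--workload", workload]
    cmd += list(extra_args)
    return cmd


def qualification_handoff(model_id, workload=None, extra_args=(), dry_run=True):
    """Hypothetical helper: return a handoff record, optionally running it."""
    cmd = build_qualify_command(model_id, workload, extra_args)
    handoff = {
        "model_id": model_id,
        "command": cmd,
        "command_line": shlex.join(cmd),
        "requires_workload_support": workload is not None,
        # Qualification is evidence for review, not a green light.
        "production_ready": False,
    }
    if not dry_run:
        handoff["returncode"] = subprocess.run(cmd, check=False).returncode
    return handoff
```

With `dry_run=True` the record is still useful on its own: it can be reviewed or replayed later, once the workload contract support is available.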

Add model registration and qualification handoff

f933a50

Thump604 closed this Apr 24, 2026

Thump604 deleted the codex/model-register-qualify-workflow branch April 24, 2026 15:48

Thump604 mentioned this pull request Apr 24, 2026

Add bench-serve workload contracts #406

Closed

Thump604 wants to merge 1 commit into feat/model-acquisition-workflow from codex/model-register-qualify-workflow