Add model registration and qualification handoff#419

Closed
Thump604 wants to merge 1 commit into feat/model-acquisition-workflow from codex/model-register-qualify-workflow

@Thump604
Collaborator

Stacked on #417. Splitting the work keeps the first model artifact workflow PR reviewable; this PR adds the next handoff layer without mutating any local Ops registry.

Scope:

  • add vllm-mlx model register <artifact> to write a portable vllm_mlx_registration_manifest.json
  • include artifact inspection, acquisition/conversion source manifests, parser policy, serving defaults, feature flags, preset alias, and explicit production_ready=false
  • add vllm-mlx model qualify <model-id> to create or run a bench-serve qualification handoff command and optional request manifest
  • keep qualification as a handoff artifact, not a green light: production readiness still requires the standing workload contract and review of the resulting evidence
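
For reviewers who want a feel for the manifest shape, here is a minimal sketch of how a registration payload could be assembled. Only the filename vllm_mlx_registration_manifest.json and the explicit production_ready=false come from this PR; build_registration_manifest, write_manifest, and every key name are hypothetical and may not match what vllm_mlx/model_workflow.py actually writes.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Optional

# Filename taken from the PR description.
MANIFEST_NAME = "vllm_mlx_registration_manifest.json"


def build_registration_manifest(
    model_id: str,
    artifact_dir: str,
    preset_alias: Optional[str] = None,
    serving_defaults: Optional[Dict[str, Any]] = None,
    feature_flags: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: key names are illustrative, not the real schema."""
    return {
        "model_id": model_id,
        "artifact": artifact_dir,
        "preset_alias": preset_alias,
        "serving_defaults": dict(serving_defaults or {}),
        "feature_flags": list(feature_flags or []),
        # Registration alone never marks a model as production ready.
        "production_ready": False,
    }


def write_manifest(manifest: Dict[str, Any], out_dir: str) -> Path:
    """Write the manifest as stable, diff-friendly JSON and return its path."""
    path = Path(out_dir) / MANIFEST_NAME
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True) + "\n")
    return path
```

Writing plain JSON with sorted keys keeps the file portable and easy to diff, which matches the goal of handing the artifact off without touching any local registry.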

Local validation:

uv run --extra dev pytest -q tests/test_model_workflow.py tests/test_download.py
# 23 passed

uv run --extra dev black --check vllm_mlx/model_workflow.py vllm_mlx/cli.py tests/test_model_workflow.py
uvx ruff check vllm_mlx/model_workflow.py vllm_mlx/cli.py tests/test_model_workflow.py --select E,F,W --ignore E402,E501,E731,F811,F841
git diff --check
# clean

CLI smokes:

vllm-mlx model qualify qwen-test --workload /tmp/workload.json --dry-run --extra-arg=--tag --extra-arg smoke
vllm-mlx model register <tmp-artifact> --model-id qwen-test --served-model-name qwen-test --preset-alias fast-qwen --mllm --default-temperature 0.6 --default-top-p 0.95 --default-top-k 20 --default-min-p 0.0 --default-presence-penalty 0.0 --default-repetition-penalty 1.0 --default-chat-template-kwargs '{"enable_thinking":true}' --feature-flag prefix_cache
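
A note on the --extra-arg=--tag spelling in the first smoke: assuming the CLI is built on argparse (an assumption, not confirmed in this description), a forwarded value that itself starts with -- has to be attached with =, otherwise argparse reads it as another option. The parser below is a hypothetical stand-in for the qualify subcommand, reduced to the flags used in the smoke.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the `model qualify` parser; the real one may differ.
    parser = argparse.ArgumentParser(prog="qualify-sketch")
    parser.add_argument("model_id")
    parser.add_argument("--workload")
    parser.add_argument("--dry-run", action="store_true")
    # Repeatable flag whose values are forwarded verbatim to the benchmark command.
    parser.add_argument("--extra-arg", action="append", default=[])
    return parser


args = build_parser().parse_args(
    [
        "qwen-test",
        "--workload", "/tmp/workload.json",
        "--dry-run",
        "--extra-arg=--tag",   # "=" form: the value starts with "--"
        "--extra-arg", "smoke",  # space form is fine for plain values
    ]
)
print(args.extra_arg)  # ['--tag', 'smoke']
```

With the space form (--extra-arg --tag), argparse would reject the command line because it treats --tag as an unknown option and finds no value for --extra-arg.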

Dependency note: model qualify --workload is intended to compose with #406's workload contract support. Without that bench-serve support, the dry-run manifest is still useful, but running the generated command with --workload depends on #406 or equivalent landing.
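
The dry-run behavior described in this note can be sketched roughly as follows: build the handoff command, optionally persist it as a manifest, and only execute when asked. This is an illustration under stated assumptions, not the code in this PR. build_handoff and qualify are hypothetical names, the --model flag is invented for the example, and only bench-serve and --workload are taken from the description.

```python
import json
import shlex
import subprocess
from pathlib import Path
from typing import Any, Dict, Optional, Sequence


def build_handoff(
    model_id: str,
    workload: Optional[str] = None,
    extra_args: Sequence[str] = (),
) -> Dict[str, Any]:
    """Assemble the benchmark command without running it."""
    # Hypothetical layout: `--model` is an assumption made for this sketch.
    command = ["vllm-mlx", "bench-serve", "--model", model_id]
    if workload:
        command += ["--workload", workload]
    command += list(extra_args)
    return {
        "model_id": model_id,
        "command": command,
        "command_line": shlex.join(command),  # copy-pasteable form
    }


def qualify(
    model_id: str,
    workload: Optional[str] = None,
    extra_args: Sequence[str] = (),
    dry_run: bool = True,
    manifest_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Return the handoff; write it to disk and/or run it when requested."""
    handoff = build_handoff(model_id, workload, extra_args)
    if manifest_path is not None:
        Path(manifest_path).write_text(json.dumps(handoff, indent=2) + "\n")
    if not dry_run:
        # Needs a bench-serve that accepts --workload (#406 or equivalent).
        subprocess.run(handoff["command"], check=True)
    return handoff
```

Keeping command construction separate from execution means the dry-run manifest stays useful even before the --workload support lands.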

Thump604 closed this Apr 24, 2026
Thump604 deleted the codex/model-register-qualify-workflow branch April 24, 2026 15:48