
Repin vllm and inspect_evals so the Dockerfile builds #36

Open

surelyMersad wants to merge 1 commit into aisa-group:add_harbor_support from surelyMersad:harbor-dockerfile-fix

Conversation

@surelyMersad

Two upstream-shifted dependencies were preventing the harbor_adapter Dockerfile from building on Modal:

- `vllm==0.11.0` requires `xformers==0.0.32.post1`, which is no longer on PyPI for manylinux_x86_64. Repinned vllm to 0.19.1, which builds and runs end-to-end on Modal.

- inspect_evals was cloned `--depth=1` from main, but main HEAD now requires Python>=3.11 while the image installs python3.10. Switched to `uv pip install "inspect_evals @ git+...@<sha>"` pinned to commit 03cb4bc2 (2026-03-15), the last commit on main still declaring `requires-python = ">=3.10"`. This also removes the manual git clone step.

Also adds local-only artifacts to .gitignore so they don't sneak in.
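
The two repins above might look roughly like this in Dockerfile form. This is a sketch only: the base image, the `--system` flag, and the repo path placeholder are assumptions, not the PR's actual Dockerfile (the source elides the full `git+` URL).

```dockerfile
# Sketch, not the actual harbor_adapter Dockerfile. Base image is an assumption.
FROM python:3.10-slim

RUN pip install uv

# Repin vllm: 0.11.0 pulls xformers==0.0.32.post1, whose manylinux_x86_64
# wheel is no longer on PyPI.
RUN uv pip install --system "vllm==0.19.1"

# Install inspect_evals at the last commit still declaring
# requires-python = ">=3.10", replacing the old `git clone --depth=1` of main.
# <org> is a placeholder; the real repo URL is elided in the PR text.
RUN uv pip install --system \
    "inspect_evals @ git+https://github.com/<org>/inspect_evals@03cb4bc2"
```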

Test plan

- `python run_adapter.py --benchmark gsm8k --model qwen3-1.7b --output ./tasks` generates tasks without error.
- `harbor run -c <task>/job.yaml --agent nop --env modal --yes` builds the image successfully on Modal (verified twice on this branch: ~8 min build with warm cache, ~15 min cold), with no resolver errors on either the vllm or inspect_evals install steps. The verifier writes reward.txt and metrics.json cleanly.

@hrdkbhatnagar (Collaborator)

Thanks for catching the build break, but we can't take the vllm bump as-is. We used vllm 0.11.0 for the original PTB leaderboard runs, and any change to the inference version risks decoding-level divergence from those baselines, which is a parity claim we need to defend in the paper. The same logic applies to the rest of the ML stack.

I've already pushed an alternate fix to add_harbor_support that mirrors `containers/opus_4_6_1m.def` from the upstream repo (the def file we used to generate the leaderboard):

- vllm pinned to 0.11.0
- ML deps from `requirements-direct.txt`
- flash-attn 2.8.3
- `inspect_ai_vllm_stdout` fork
- `--torch-backend=cu128` (Modal's build VM has no nvidia-smi, so CUDA autodetect resolves to CPU torch and breaks vllm's xformers requirement)

This builds on Modal end to end, so the inspect_evals switch to a pinned commit is also not needed.
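
The pin set above could be sketched as an install step like the following. This is a guess at the shape, not the contents of `opus_4_6_1m.def`: the fork URL is a placeholder, and whether flash-attn needs `--no-build-isolation` in this image is not stated in the thread.

```dockerfile
# Sketch of the alternate fix; fork URL and surrounding steps are placeholders.
# --torch-backend=cu128 forces the CUDA 12.8 torch wheels explicitly, since
# Modal's build VM has no nvidia-smi and autodetection would otherwise
# resolve to CPU torch, breaking vllm's xformers requirement.
RUN uv pip install --system --torch-backend=cu128 \
    "vllm==0.11.0" \
    -r requirements-direct.txt \
    "flash-attn==2.8.3" \
    "inspect_ai @ git+https://github.com/<org>/inspect_ai_vllm_stdout"
```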
