fix(ci): SLSA provenance must attest pre-built release artifacts #54
The slsa-provenance.yml workflow was running `uv build` to regenerate dist/ inside the provenance job, then hashing those fresh artifacts. Because Python builds are not bit-for-bit reproducible, the resulting SLSA attestation described files that diverged from what the upstream Semantic Release job actually published to the GitHub Release. The chain of custody was meaningless.

Replace the local build with artifact retrieval:

- workflow_run trigger: download the `release-dist` GHA artifact from the upstream Semantic Release run (run-id from github.event.workflow_run.id) using actions/download-artifact v8.0.1.
- workflow_dispatch trigger: download the wheels and sdists attached to the matching `v<version>` GitHub Release via `gh release download`.

The provenance now attests the bytes that were actually shipped, which is the entire point of SLSA L3 supply-chain attestation.

Adds a guard that fails the run when dist/ contains neither a wheel nor an sdist so the SLSA generator never emits an empty provenance.

Refs: ~/.claude/projects/-home-byron-dev--claude/memory/feedback_slsa_provenance_pattern.md
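The empty-`dist/` guard mentioned in this commit message could look roughly like the sketch below. The workflow's actual step is not shown on this page, so this is only an illustration: `has_release_artifacts` is a hypothetical helper name, and the check simply looks for at least one `*.whl` or `*.tar.gz` file in the given directory.

```shell
# Hypothetical helper (not taken from the workflow): succeed only when the
# directory holds at least one wheel or sdist.
# $1 = directory to inspect (defaults to dist)
has_release_artifacts() {
  local dir="${1:-dist}"
  local f
  for f in "$dir"/*.whl "$dir"/*.tar.gz; do
    # An unmatched glob stays literal, so confirm the path is a real file.
    if [ -f "$f" ]; then
      return 0
    fi
  done
  return 1
}

# Possible use inside a workflow step:
#   if ! has_release_artifacts dist; then
#     echo "::error::dist/ contains neither a wheel nor an sdist"
#     exit 1
#   fi
```

Testing each glob result with `-f` avoids depending on `nullglob`, which keeps the check usable under a plain `set -euo pipefail` step.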
Caution: Review failed. Pull request was closed or merged during review.

Walkthrough

The SLSA Provenance workflow was restructured to generate SLSA Level 3 provenance from published release artifacts instead of building and hashing in-workflow. Manual invocation now uses an optional

Changes

SLSA Provenance Workflow Refactoring
Sequence Diagram

sequenceDiagram
participant SemanticRelease as Semantic Release Workflow
participant HashJob as Hash Job
participant GitHubAPI as GitHub API
participant SLSAGen as SLSA Generator
participant Artifacts as Release Artifacts
SemanticRelease->>Artifacts: publish release-dist (*.whl, *.tar.gz)
HashJob->>GitHubAPI: resolve upstream run (prefer run_id)
GitHubAPI-->>HashJob: run metadata & validation
HashJob->>Artifacts: download release-dist
Artifacts-->>HashJob: distribution files
HashJob->>GitHubAPI: resolve release tag from SHA
GitHubAPI-->>HashJob: tag name
HashJob->>HashJob: compute SHA256 hashes
HashJob->>SLSAGen: pass hashes & tag
SLSAGen->>GitHubAPI: request id-token (OIDC)
GitHubAPI-->>SLSAGen: id-token
SLSAGen->>SLSAGen: generate intoto attestation
SLSAGen->>Artifacts: upload multiple.intoto.jsonl
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 5 passed
Dependency Review

✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
Pull request overview
This PR updates the SLSA provenance workflow to attest the exact pre-built release artifacts (downloaded from the upstream release workflow run or from an existing GitHub Release), instead of rebuilding packages inside the provenance job (which can produce non-reproducible bytes and invalidate the attestation).
Changes:
- Replaces `uv build` with artifact retrieval:
  - `workflow_run`: downloads the `release-dist` artifact from the triggering "Semantic Release" run.
  - `workflow_dispatch`: downloads wheels/sdists from the matching `v<version>` GitHub Release via `gh`.
- Tightens permissions to workflow-level `permissions: {}` with job-scoped permissions.
- Adds guards/validation to prevent generating provenance when `dist/` is empty and to validate the manual `version` input format.
    if ! [[ "$INPUT_VERSION" =~ ^[A-Za-z0-9._+-]+$ ]]; then
      echo "::error::Invalid version input: must match ^[A-Za-z0-9._+-]+$"
      exit 1
    fi
    TAG="v$INPUT_VERSION"
    mkdir -p dist
    # Pull wheels and sdists attached to the release.
Resolved by williaby in 1f4e9da: the `workflow_dispatch` input changed from a `version` string to an integer-validated `run_id`, so the `TAG="v$INPUT_VERSION"` concatenation no longer exists. The tag is now resolved by querying the GitHub Releases API for the head SHA.
    with:
      subject-path: 'dist/*'
    set -euo pipefail
    HASHES=$(sha256sum ./* | base64 -w0)
Resolved by williaby in 1f4e9da: the hashing step now uses `cd dist && sha256sum -- *.whl *.tar.gz | sort | base64 -w0`, which restores bare-name subjects and adds a flag-injection guard (`--`), explicit globs scoped to wheel/sdist, and deterministic ordering.
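To make the effect of that one-liner concrete, here is a minimal sketch wrapping the same pipeline in a function. `hash_subjects` is a hypothetical name, and the sketch assumes GNU coreutils (`sha256sum`, `base64 -w0`) and that the directory contains at least one wheel and one sdist; an unmatched glob would be passed to `sha256sum` literally and reported as an error.

```shell
# Hypothetical wrapper around the pipeline quoted in the review thread.
# $1 = directory holding the release artifacts (defaults to dist)
hash_subjects() {
  local dir="${1:-dist}"
  (
    # Subshell: the cd does not leak, and hashing from inside the directory
    # keeps subject names bare (no "dist/" or "./" prefix).
    cd "$dir" || exit 1
    # "--" ends option parsing, so a file named like "-x.whl" cannot be
    # read as a flag. sort gives a stable order; base64 -w0 emits one line.
    sha256sum -- *.whl *.tar.gz | sort | base64 -w0
  )
}

# Possible use inside a workflow step:
#   echo "hashes=$(hash_subjects dist)" >> "$GITHUB_OUTPUT"
```

Decoding the result with `base64 -d` yields ordinary `sha256sum` lines, which is a quick way to check by eye that only the intended files became subjects.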
The previous revision still routed through `ByronWilliamsCPA/.github/.github/workflows/python-slsa.yml`, which cannot work as a reusable workflow:

1. The template defines only `on: workflow_dispatch:`, so it has no `workflow_call:` entry-point and `uses:` cannot invoke it.
2. The template itself calls `slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml`, which is a reusable workflow. GitHub Actions forbids nested reusable workflow calls.
3. The template's own header docs say: "This is a TEMPLATE, not a reusable workflow ... You must copy the 'provenance' job directly into your release workflow."

CI on PR #54 still went green because the PR validation gates only exercise workflow file syntax; the actual `workflow_run` -> SLSA provenance flow fires only after a real Semantic Release run and therefore was never exercised on the PR.

Switch to the inline pattern used by the working exemplar `ByronWilliamsCPA/homelab-infra` PR #435:

- `hash` job downloads the upstream `release-dist` artifact via `actions/download-artifact@v8.0.1` using `github.event.workflow_run.id` (or a `run_id` dispatch input).
- Hashes are computed only over `*.whl` and `*.tar.gz` and sorted for determinism before base64 encoding.
- A tag resolver picks the tag created by Semantic Release at the head SHA, falling back to the most recent release.
- `provenance` job calls the official SLSA generator `slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@v2.1.0` directly with `base64-subjects`, `upload-assets: true`, and `upload-tag-name: ${{ needs.hash.outputs.tag }}`.

The dispatch input changes from `version` (string) to `run_id` (integer) to match the artifact-download model: provenance is now keyed off a specific Semantic Release run, not a version string.

Refs: ByronWilliamsCPA/homelab-infra#435
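The run-selection step described in this commit message (an event-supplied run id or a `run_id` dispatch input) might be sketched as follows. `resolve_run_id` is a hypothetical helper, and the precedence shown, dispatch input first and event id second, is an assumption; the page does not show how the real workflow orders them.

```shell
# Hypothetical helper: pick the upstream run id and require a positive integer.
# $1 = run_id dispatch input (may be empty)
# $2 = github.event.workflow_run.id (may be empty)
resolve_run_id() {
  # Assumed precedence: use the dispatch input when present, else the event id.
  local candidate="${1:-${2:-}}"
  case "$candidate" in
    ''|0*|*[!0-9]*)
      # Empty, leading zero, or any non-digit character: refuse to continue.
      echo "invalid run id: '$candidate'" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$candidate"
}

# Possible use inside a workflow step:
#   RUN_ID=$(resolve_run_id "$INPUT_RUN_ID" "$EVENT_RUN_ID") || exit 1
```

Validating with a `case` pattern keeps the value from ever being interpreted by the shell, so a crafted input fails the step instead of reaching the download call.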
PR Review

Reviewed at HEAD

The architectural pivot to inline the SLSA generator is correct: GitHub Actions forbids nested reusable-workflow calls, and the org

CodeRabbit hit the organization PR-review rate limit and did not deliver a review.

Important
Suggested
🤖 Generated with Claude Code
Two review-driven hardenings to slsa-provenance.yml:

1. Verify that the supplied run_id belongs to a successful Semantic Release workflow run before downloading and attesting its artifacts. The workflow_run trigger filter already restricts the automated path, but the workflow_dispatch path accepted any positive integer and would attest whatever release-dist artifact existed on that run. A user with Actions:write could mint a fraudulent SLSA attestation describing substitute bytes.
2. Pass HEAD_BRANCH and HEAD_SHA to the jq filter via --arg rather than shell interpolation into the filter string. Not exploitable today (workflow_run constrains both values), but --arg is the standard pattern and removes the entire injection class.

Refs PR #54 review (SEC-005 Important, SEC-007 Suggested).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
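Both hardenings can be illustrated with one small sketch. It is not the workflow's actual filter, which is not shown on this page: `run_is_trusted` is a hypothetical helper that reads a workflow-run JSON object on stdin, and it assumes the `name`, `conclusion`, `head_branch`, and `head_sha` fields of the GitHub REST workflow-run object. Values reach `jq` only through `--arg`, so they are compared as data and never parsed as filter syntax.

```shell
# Hypothetical helper: succeed only when the run JSON on stdin describes a
# successful run of the expected workflow at the expected branch and commit.
# $1 = workflow name, $2 = head branch, $3 = head SHA
run_is_trusted() {
  # -e maps a false/null result to a non-zero exit status.
  jq -e \
    --arg name "$1" \
    --arg branch "$2" \
    --arg sha "$3" \
    '.name == $name
     and .conclusion == "success"
     and .head_branch == $branch
     and .head_sha == $sha' >/dev/null
}

# Possible use inside a workflow step (gh and jq assumed to be installed):
#   gh api "repos/$GITHUB_REPOSITORY/actions/runs/$RUN_ID" \
#     | run_is_trusted "Semantic Release" "$HEAD_BRANCH" "$HEAD_SHA" \
#     || { echo "::error::run $RUN_ID is not a trusted release run"; exit 1; }
```

Had the branch or SHA been spliced into the filter string instead, a value containing a quote could change the filter's meaning; with `--arg` the same value just fails the equality test.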
PR Fix Summary

Pushed

Addressed (1 Important + 1 Suggested from the /pr-review pass):
Replied to stale review threads:
Deliberately not addressed:
Pre-commit passing locally (yaml-check, actionlint, em-dash check, trufflehog, github-workflows validation).

🤖 Generated with Claude Code
Problem

`slsa-provenance.yml` ran `uv build` inside the provenance job, then hashed those freshly-built artifacts. Because Python builds are not bit-for-bit reproducible, the attestation described files that diverged from whatever the upstream Semantic Release run actually published to the GitHub Release (and would diverge from PyPI uploads once `publish-to-pypi` is enabled). The supply-chain guarantee was meaningless.

This is the recurring pattern documented at `~/.claude/projects/-home-byron-dev--claude/memory/feedback_slsa_provenance_pattern.md`: a SLSA provenance job must hash the exact files that shipped, never a fresh local rebuild.

Fix
Replace the local `uv build` with artifact retrieval, branching on the trigger:

- `workflow_run`: download the `release-dist` GitHub Actions artifact (uploaded by `python-release.yml`, 5-day retention) from the upstream Semantic Release run via `actions/download-artifact@v8.0.1` with `run-id: github.event.workflow_run.id`.
- `workflow_dispatch`: download the wheels and sdists attached to the matching `v<version>` GitHub Release using `gh release download` (covers re-runs after GHA artifact retention has lapsed).

The provenance now attests the same bytes that landed in the release.
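For the `workflow_dispatch` branch, the download could be sketched as below. `download_release_dist` is a hypothetical helper; the `--pattern` and `--dir` options are standard `gh release download` flags, while the second parameter (the binary to invoke) exists only so the function can be dry-run with `echo`. The `case` pattern accepts the same character set as the `^[A-Za-z0-9._+-]+$` check mentioned in this description.

```shell
# Hypothetical helper: fetch the wheels and sdists attached to release v<version>.
# $1 = version without the leading "v"
# $2 = CLI binary to invoke (defaults to gh; pass "echo" for a dry run)
download_release_dist() {
  local version="$1"
  local gh_bin="${2:-gh}"
  # Reject empty input and anything outside A-Z a-z 0-9 . _ + -
  # before the value is concatenated into a tag name.
  case "$version" in
    ''|*[!A-Za-z0-9._+-]*)
      echo "::error::Invalid version input: must match ^[A-Za-z0-9._+-]+\$"
      return 1
      ;;
  esac
  mkdir -p dist
  # Only wheel and sdist assets are pulled; other release assets are ignored.
  "$gh_bin" release download "v$version" \
    --pattern '*.whl' --pattern '*.tar.gz' --dir dist
}

# Possible use inside a workflow step (GH_TOKEN assumed to be set):
#   download_release_dist "$INPUT_VERSION" || exit 1
```

Downloading from the release rather than from the Actions artifact is what lets a manual re-run work after the artifact retention window has passed.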
Safety

- Fails the run when `dist/` contains neither a wheel nor an sdist, so the SLSA generator never emits empty provenance.
- Workflow-level `permissions: {}` with job-scoped `contents: read`, `actions: read` for the collect job; the `slsa` job keeps `id-token: write` + `contents: write` + `actions: read` for the org reusable workflow.
- `INPUT_VERSION` validated against `^[A-Za-z0-9._+-]+$` before interpolation.

Verification

- `actionlint` clean.
- `origin/main` in a per-repo worktree at `.worktrees/slsa-provenance-fix`.

Refs: feedback_slsa_provenance_pattern.md