fix(ci): SLSA provenance must attest pre-built release artifacts#54

Merged
williaby merged 3 commits into
mainfrom
fix/slsa-provenance-download-artifacts
May 26, 2026

Conversation

@williaby williaby commented May 26, 2026

Problem

slsa-provenance.yml ran uv build inside the provenance job, then hashed those freshly-built artifacts. Because Python builds are not bit-for-bit reproducible, the attestation described files that diverged from whatever the upstream Semantic Release run actually published to the GitHub Release (and would diverge from PyPI uploads once publish-to-pypi is enabled). The supply-chain guarantee was meaningless.

This is the recurring pattern documented at ~/.claude/projects/-home-byron-dev--claude/memory/feedback_slsa_provenance_pattern.md: a SLSA provenance job must hash the exact files that shipped, never a fresh local rebuild.

Fix

Replace the local uv build with artifact retrieval, branching on the trigger:

  • workflow_run: download the release-dist GitHub Actions artifact (uploaded by python-release.yml, 5-day retention) from the upstream Semantic Release run via actions/download-artifact@v8.0.1 with run-id: github.event.workflow_run.id.
  • workflow_dispatch: download the wheels and sdists attached to the matching v<version> GitHub Release using gh release download (covers re-runs after GHA artifact retention has lapsed).

The provenance now attests the same bytes that landed in the release.

Safety

  • Added a guard that fails the job when dist/ contains neither a wheel nor an sdist so the SLSA generator never emits empty provenance.
  • Workflow-level permissions: {} with job-scoped contents: read, actions: read for the collect job; the slsa job keeps id-token: write + contents: write + actions: read for the org reusable workflow.
  • INPUT_VERSION validated against ^[A-Za-z0-9._+-]+$ before interpolation.
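
The two guards above can be sketched in portable shell. This is a minimal illustration, not the code in slsa-provenance.yml: the helper names `has_dist_files` and `is_valid_version` are hypothetical, and the `case` pattern is assumed to be equivalent to the `^[A-Za-z0-9._+-]+$` regex named in the list.

```shell
# Hypothetical helper: succeed only if the directory holds at least one
# wheel or sdist, so provenance is never generated for an empty dist/.
has_dist_files() {
  for f in "$1"/*.whl "$1"/*.tar.gz; do
    # An unmatched glob stays literal, so test that the path really exists.
    [ -f "$f" ] && return 0
  done
  return 1
}

# Hypothetical helper: accept only non-empty strings made of the characters
# allowed by ^[A-Za-z0-9._+-]+$ (assumed equivalent to the workflow's regex).
is_valid_version() {
  case "$1" in
    '' | *[!A-Za-z0-9._+-]*) return 1 ;;
    *) return 0 ;;
  esac
}
```

A step would call these before any interpolation or hashing, for example `is_valid_version "$INPUT_VERSION" || exit 1` followed later by `has_dist_files dist || exit 1`.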

Verification

  • actionlint clean.
  • No em-dashes (PC-011 compliant).
  • Branch built from origin/main in a per-repo worktree at .worktrees/slsa-provenance-fix.
  • CI on this PR.

Refs: feedback_slsa_provenance_pattern.md

Summary by CodeRabbit

  • Chores
    • Updated release provenance workflow to generate SLSA Level 3 provenance from published artifacts.
    • Improved security and reliability of release artifact verification.

The slsa-provenance.yml workflow was running `uv build` to regenerate
dist/ inside the provenance job, then hashing those fresh artifacts.
Because Python builds are not bit-for-bit reproducible, the resulting
SLSA attestation described files that diverged from what the upstream
Semantic Release job actually published to the GitHub Release. The
chain of custody was meaningless.

Replace the local build with artifact retrieval:

- workflow_run trigger: download the `release-dist` GHA artifact from
  the upstream Semantic Release run (run-id from
  github.event.workflow_run.id) using actions/download-artifact v8.0.1.
- workflow_dispatch trigger: download the wheels and sdists attached to
  the matching `v<version>` GitHub Release via `gh release download`.

The provenance now attests the bytes that were actually shipped, which
is the entire point of SLSA L3 supply-chain attestation. Adds a guard
that fails the run when dist/ contains neither a wheel nor an sdist so
the SLSA generator never emits an empty provenance.

Refs: ~/.claude/projects/-home-byron-dev--claude/memory/feedback_slsa_provenance_pattern.md
Copilot AI review requested due to automatic review settings May 26, 2026 02:56

coderabbitai Bot commented May 26, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

The SLSA Provenance workflow was restructured to generate SLSA Level 3 provenance from published release artifacts instead of building and hashing in-workflow. Manual invocation now uses an optional run_id input. A new hash job validates the upstream Semantic Release run, downloads release artifacts, and computes deterministic hashes. The provenance job then calls the official SLSA3 generator inline with those hashes to produce standardized attestations.

Changes

SLSA Provenance Workflow Refactoring

| Layer / File(s) | Summary |
|---|---|
| Workflow Trigger and Contract Updates (`.github/workflows/slsa-provenance.yml`) | Workflow metadata, triggers, and job interface updated: manual invocation contract changes from required version input to optional run_id; new hash job output interface exposes hashes and tag for downstream provenance generation. |
| Artifact Download and Hashing Pipeline (`.github/workflows/slsa-provenance.yml`) | Hash job validates upstream Semantic Release run (prefers explicit run_id when provided, otherwise uses triggering run), hard-fails on non-successful or non-Semantic Release runs, downloads release-dist artifact, errors on empty artifacts, resolves release tag via GitHub API, and computes deterministic base64-encoded SHA256 hashes for published *.whl and *.tar.gz files only. |
| Inline SLSA3 Provenance Generation (`.github/workflows/slsa-provenance.yml`) | Provenance job calls official slsa-github-generator generic SLSA3 generator inline, wiring computed hashes as subjects, enabling artifact upload, attaching resolved tag via upload-tag-name, setting provenance output name to multiple.intoto.jsonl, and using id-token permissions for OIDC-based token issuance. |

Sequence Diagram

sequenceDiagram
  participant SemanticRelease as Semantic Release Workflow
  participant HashJob as Hash Job
  participant GitHubAPI as GitHub API
  participant SLSAGen as SLSA Generator
  participant Artifacts as Release Artifacts
  
  SemanticRelease->>Artifacts: publish release-dist (*.whl, *.tar.gz)
  HashJob->>GitHubAPI: resolve upstream run (prefer run_id)
  GitHubAPI-->>HashJob: run metadata & validation
  HashJob->>Artifacts: download release-dist
  Artifacts-->>HashJob: distribution files
  HashJob->>GitHubAPI: resolve release tag from SHA
  GitHubAPI-->>HashJob: tag name
  HashJob->>HashJob: compute SHA256 hashes
  HashJob->>SLSAGen: pass hashes & tag
  SLSAGen->>GitHubAPI: request id-token (OIDC)
  GitHubAPI-->>SLSAGen: id-token
  SLSAGen->>SLSAGen: generate intoto attestation
  SLSAGen->>Artifacts: upload multiple.intoto.jsonl

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

ci, security

Poem

🐰 From build to blessing, we've taken a leap,
Hashing artifacts that others do keep,
SLSA3 shines with attestations so fine,
No more building here—just artifacts divine! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: replacing local builds with attestation of pre-built release artifacts to fix SLSA provenance integrity. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

github-actions Bot commented May 26, 2026

⚠️ Deprecation Warning: The deny-licenses option is deprecated for possible removal in the next major release. For more information, see issue 997.

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

| Package | Version | Score |
|---|---|---|
| actions/actions/download-artifact | 3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c | 🟢 5.3 |

Details

| Check | Score | Reason |
|---|---|---|
| Packaging | ⚠️ -1 | packaging workflow not detected |
| Dangerous-Workflow | 🟢 10 | no dangerous workflow patterns detected |
| Code-Review | 🟢 10 | all changesets reviewed |
| Maintained | ⚠️ 2 | 3 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 2 |
| Binary-Artifacts | 🟢 10 | no binaries found in the repo |
| CII-Best-Practices | ⚠️ 0 | no effort to earn an OpenSSF best practices badge detected |
| Token-Permissions | ⚠️ 0 | detected GitHub workflow tokens with excessive permissions |
| Pinned-Dependencies | ⚠️ 0 | dependency not pinned by hash detected -- score normalized to 0 |
| Fuzzing | ⚠️ 0 | project is not fuzzed |
| License | 🟢 10 | license file detected |
| Signed-Releases | ⚠️ -1 | no releases found |
| Security-Policy | 🟢 9 | security policy file detected |
| SAST | 🟢 10 | SAST tool is run on all commits |
| Branch-Protection | ⚠️ 0 | branch protection not enabled on development/release branches |

| Package | Version | Score |
|---|---|---|
| actions/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml | f7dd8c54c2067bafc12ca7a55595d5ee9b75204a | 🟢 7 |

Details

| Check | Score | Reason |
|---|---|---|
| Dependency-Update-Tool | 🟢 10 | update tool detected |
| Code-Review | 🟢 10 | all changesets reviewed |
| Security-Policy | 🟢 10 | security policy file detected |
| Maintained | 🟢 5 | 1 commit(s) and 5 issue activity found in the last 90 days -- score normalized to 5 |
| Binary-Artifacts | 🟢 10 | no binaries found in the repo |
| Dangerous-Workflow | 🟢 10 | no dangerous workflow patterns detected |
| Packaging | ⚠️ -1 | packaging workflow not detected |
| Token-Permissions | 🟢 10 | GitHub workflow tokens follow principle of least privilege |
| CII-Best-Practices | 🟢 5 | badge detected: Passing |
| Pinned-Dependencies | 🟢 4 | dependency not pinned by hash detected -- score normalized to 4 |
| SAST | 🟢 7 | SAST tool detected but not run on all commits |
| Fuzzing | ⚠️ 0 | project is not fuzzed |
| License | 🟢 10 | license file detected |
| Signed-Releases | 🟢 10 | 5 out of the last 5 releases have a total of 5 signed artifacts. |
| Branch-Protection | ⚠️ 2 | branch protection is not maximal on development and all release branches |
| CI-Tests | ⚠️ 2 | 7 out of 28 merged PRs checked by a CI test -- score normalized to 2 |
| Contributors | 🟢 10 | project has 33 contributing companies or organizations |
| Vulnerabilities | ⚠️ 0 | 107 existing vulnerabilities detected |

Scanned Files

  • .github/workflows/slsa-provenance.yml

Copilot AI left a comment

Pull request overview

This PR updates the SLSA provenance workflow to attest the exact pre-built release artifacts (downloaded from the upstream release workflow run or from an existing GitHub Release), instead of rebuilding packages inside the provenance job (which can produce non-reproducible bytes and invalidate the attestation).

Changes:

  • Replaces uv build with artifact retrieval:
    • workflow_run: downloads the release-dist artifact from the triggering “Semantic Release” run.
    • workflow_dispatch: downloads wheels/sdists from the matching v<version> GitHub Release via gh.
  • Tightens permissions to workflow-level permissions: {} with job-scoped permissions.
  • Adds guards/validation to prevent generating provenance when dist/ is empty and to validate the manual version input format.

Comment thread .github/workflows/slsa-provenance.yml Outdated
Comment on lines +84 to +90
if ! [[ "$INPUT_VERSION" =~ ^[A-Za-z0-9._+-]+$ ]]; then
  echo "::error::Invalid version input: must match ^[A-Za-z0-9._+-]+$"
  exit 1
fi
TAG="v$INPUT_VERSION"
mkdir -p dist
# Pull wheels and sdists attached to the release.

Resolved by williaby in 1f4e9da: the workflow_dispatch input changed from a version string to an integer-validated run_id, so the TAG="v$INPUT_VERSION" concatenation no longer exists. The tag is now resolved by querying the GitHub Releases API for the head SHA.

Comment thread .github/workflows/slsa-provenance.yml Outdated
with:
subject-path: 'dist/*'
set -euo pipefail
HASHES=$(sha256sum ./* | base64 -w0)

Resolved by williaby in 1f4e9da: the hashing step now uses cd dist && sha256sum -- *.whl *.tar.gz | sort | base64 -w0. This restores bare-name subjects and adds a flag-injection guard (--), explicit globs scoped to wheels and sdists, and deterministic ordering.
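
For illustration, the hashing pattern described above can be wrapped in a small function. This is a sketch under assumptions, not the workflow's actual step: `hash_dist` is a hypothetical name, the existence check is added here so an unmatched glob is not handed to `sha256sum` as a literal filename, and `sha256sum` and `base64 -w0` are assumed to be the GNU coreutils versions found on Ubuntu runners.

```shell
# Hypothetical helper: print base64-encoded, sorted "hash  filename" lines
# for the wheels and sdists in a directory. Runs in a subshell so the cd
# does not leak into the caller.
hash_dist() (
  cd "$1" || exit 1
  set --
  for f in *.whl *.tar.gz; do
    # Skip globs that matched nothing (they remain literal strings).
    [ -f "$f" ] && set -- "$@" "$f"
  done
  if [ "$#" -eq 0 ]; then
    echo "no wheel or sdist found in $1" >&2
    exit 1
  fi
  # "--" stops option parsing, so a file named like a flag is treated as a file.
  # Running from inside the directory keeps subject names bare (no "dist/" prefix).
  sha256sum -- "$@" | sort | base64 -w0
)
```

Usage would look like `HASHES=$(hash_dist dist)`; because the lines are sorted before encoding, two runs over the same files produce the same string.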

The previous revision still routed through
`ByronWilliamsCPA/.github/.github/workflows/python-slsa.yml`, which
cannot work as a reusable workflow:

1. The template defines only `on: workflow_dispatch:`, so it has no
   `workflow_call:` entry-point and `uses:` cannot invoke it.
2. The template itself calls `slsa-framework/slsa-github-generator/
   .github/workflows/generator_generic_slsa3.yml`, which is a reusable
   workflow. GitHub Actions forbids nested reusable workflow calls.
3. The template's own header docs say: "This is a TEMPLATE, not a
   reusable workflow ... You must copy the 'provenance' job directly
   into your release workflow."

CI on PR #54 still went green because the PR validation gates only
exercise workflow file syntax; the actual `workflow_run` -> SLSA
provenance flow fires only after a real Semantic Release run and
therefore was never exercised on the PR.

Switch to the inline pattern used by the working exemplar
`ByronWilliamsCPA/homelab-infra` PR #435:

- `hash` job downloads the upstream `release-dist` artifact via
  `actions/download-artifact@v8.0.1` using
  `github.event.workflow_run.id` (or a `run_id` dispatch input).
- Hashes are computed only over `*.whl` and `*.tar.gz` and sorted for
  determinism before base64 encoding.
- A tag resolver picks the tag created by Semantic Release at the
  head SHA, falling back to the most recent release.
- `provenance` job calls the official SLSA generator
  `slsa-framework/slsa-github-generator/.github/workflows/
  generator_generic_slsa3.yml@v2.1.0` directly with
  `base64-subjects`, `upload-assets: true`, and
  `upload-tag-name: ${{ needs.hash.outputs.tag }}`.

The dispatch input changes from `version` (string) to `run_id`
(integer) to match the artifact-download model: provenance is now
keyed off a specific Semantic Release run, not a version string.

Refs: ByronWilliamsCPA/homelab-infra#435
@williaby

PR Review

Reviewed at HEAD 1f4e9daf1841ac54ac3f68a77eb7be30a0701fed (2026-05-26 03:03 UTC). All 33 CI checks pass; SonarCloud Quality Gate passed (0 new issues, 0 hotspots); GitGuardian and Socket Security clean.

The architectural pivot to inline the SLSA generator is correct: GitHub Actions forbids nested reusable-workflow calls, and the org python-slsa.yml is workflow_dispatch-only. Both of Copilot's inline comments (the vv0.1.0 and sha256sum ./* findings) were made obsolete by commit 1f4e9daf four minutes after they were posted; those threads can be marked resolved manually.

CodeRabbit hit the organization PR-review rate limit and did not deliver a review.

Important

  • Manual path trust gap. slsa-provenance.yml L66-L94 accepts any positive integer as `run_id` and downloads the `release-dist` artifact from that run without verifying the run belongs to the `Semantic Release` workflow. A user with `Actions: write` could pass a `run_id` from another workflow that uploads an artifact named `release-dist` and mint a SLSA attestation describing those substitute bytes. The automated `workflow_run` path is unaffected by this. Fix: after resolving `RUN_ID`, validate the run with `gh api repos/${GITHUB_REPOSITORY}/actions/runs/${RUN_ID}` and assert `.name == 'Semantic Release'` and `.conclusion == 'success'`.

  • Pre-existing egress-policy gap (informational restate). slsa-provenance.yml L64 keeps `egress-policy: audit` with the existing `TODO: switch to block after 2026-06-30`. Not introduced by this PR. Worth restating only because this PR's central thesis is supply-chain hardening; the deadline is already tracked.
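
A minimal sketch of the run validation suggested in the first item, assuming `jq` is available on the runner. The helper names `is_positive_int` and `is_trusted_run` are hypothetical; the `.name` and `.conclusion` fields are the ones named in the finding.

```shell
# Hypothetical helper: accept only a positive integer run id.
is_positive_int() {
  case "$1" in
    '' | *[!0-9]* | 0*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper: read a workflow-run JSON object on stdin and succeed
# only when it describes a successful "Semantic Release" run.
# jq -e exits non-zero when the expression evaluates to false or null.
is_trusted_run() {
  jq -e '.name == "Semantic Release" and .conclusion == "success"' >/dev/null
}

# Intended use inside the workflow step (not executed here):
#   is_positive_int "$RUN_ID" || exit 1
#   gh api "repos/${GITHUB_REPOSITORY}/actions/runs/${RUN_ID}" | is_trusted_run || exit 1
```

Keeping the check as a filter over stdin means the trust decision can be exercised with canned JSON, without calling the GitHub API.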

Suggested

  • jq filter uses shell interpolation, not `--arg`. slsa-provenance.yml L117-L119 embeds `${HEAD_BRANCH:-main}` and `${HEAD_SHA}` directly in the `--jq` expression. Not exploitable today (workflow_run constrains both values to safe shapes), but `--arg b ... --arg s ...` is the standard hardening.

  • Latest-release fallback can mis-attach provenance. slsa-provenance.yml L120-L127 falls back to `gh release list --limit 1` if the SHA-based lookup fails. If a newer release publishes between the upstream Semantic Release run and this provenance run, the attestation attaches to the wrong tag silently. Low probability; consider failing explicitly on resolve-miss rather than guessing.
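
Both suggestions can be combined in one sketch: pass the branch and SHA to `jq` as data with `--arg`, and fail explicitly when no release matches instead of guessing. The function name `resolve_tag` is hypothetical, and the filter on `target_commitish` is an assumption made for illustration; the workflow's actual filter is not shown in this thread.

```shell
# Hypothetical helper: read a JSON array of releases on stdin and print the
# tag of the first release whose target_commitish equals the given branch
# or SHA. Exits non-zero when nothing matches (no latest-release fallback).
resolve_tag() (
  # $b and $s are jq variables bound by --arg; the shell never splices the
  # values into the filter text, so they cannot change the filter's logic.
  tag=$(jq -r --arg b "$1" --arg s "$2" \
    '[.[] | select(.target_commitish == $b or .target_commitish == $s) | .tag_name][0] // empty')
  if [ -z "$tag" ]; then
    echo "no release matches branch '$1' or sha '$2'" >&2
    exit 1
  fi
  printf '%s\n' "$tag"
)

# Intended use inside the workflow step (not executed here):
#   TAG=$(gh api "repos/${GITHUB_REPOSITORY}/releases" | resolve_tag "$HEAD_BRANCH" "$HEAD_SHA")
```

Failing on a miss trades a rare manual re-run for the guarantee that provenance is never attached to a tag the run did not produce.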


🤖 Generated with Claude Code

Two review-driven hardenings to slsa-provenance.yml:

1. Verify that the supplied run_id belongs to a successful Semantic
   Release workflow run before downloading and attesting its artifacts.
   The workflow_run trigger filter already restricts the automated path,
   but the workflow_dispatch path accepted any positive integer and
   would attest whatever release-dist artifact existed on that run.
   A user with Actions:write could mint a fraudulent SLSA attestation
   describing substitute bytes.

2. Pass HEAD_BRANCH and HEAD_SHA to the jq filter via --arg rather than
   shell interpolation into the filter string. Not exploitable today
   (workflow_run constrains both values), but --arg is the standard
   pattern and removes the entire injection class.

Refs PR #54 review (SEC-005 Important, SEC-007 Suggested).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@williaby

PR Fix Summary

Pushed e8eefc2 on top of 1f4e9da. CI is re-running automatically.

Addressed (1 Important + 1 Suggested from the /pr-review pass):

  • SEC-005 (Important). Resolve source run id step now calls gh api repos/${GITHUB_REPOSITORY}/actions/runs/${RUN_ID} and asserts both .name == "Semantic Release" and .conclusion == "success" before continuing. The workflow_run trigger filter already constrained the automated path; this brings the workflow_dispatch path up to the same trust level.
  • SEC-007 (Suggested). Resolve release tag step now pipes gh api's output through external jq with --arg b ... --arg s ... instead of shell-interpolating HEAD_BRANCH and HEAD_SHA into the jq filter string. gh api --jq does not accept --arg (it embeds a jq library, not the CLI), so the rewrite uses a real jq invocation downstream of gh api.

Replied to stale review threads:

  • Copilot's vv0.1.0 finding: marked resolved by williaby's 1f4e9da (input redesigned to integer run_id).
  • Copilot's sha256sum ./* finding: marked resolved by williaby's 1f4e9da (now cd dist && sha256sum -- *.whl *.tar.gz | sort).

Deliberately not addressed:

  • SEC-011 (Important, pre-existing). egress-policy: audit is unchanged. It pre-dates this PR and is already tracked by the # TODO: switch to block after 2026-06-30 comment.
  • Latest-release fallback (Suggested). gh release list --limit 1 fallback in Resolve release tag is unchanged. Removing it would be a behavior change that should be williaby's call rather than a drive-by fix.

Pre-commit passing locally (yaml-check, actionlint, em-dash check, trufflehog, github-workflows validation).

🤖 Generated with Claude Code

@williaby williaby merged commit cd2cb22 into main May 26, 2026
35 of 36 checks passed
@williaby williaby deleted the fix/slsa-provenance-download-artifacts branch May 26, 2026 03:19