fix(ci): SLSA provenance must attest pre-built release artifacts by williaby · Pull Request #54 · ByronWilliamsCPA/rag-processor

williaby · 2026-05-26T02:56:27Z

Problem

slsa-provenance.yml ran uv build inside the provenance job, then hashed those freshly-built artifacts. Because Python builds are not bit-for-bit reproducible, the attestation described files that diverged from whatever the upstream Semantic Release run actually published to the GitHub Release (and would diverge from PyPI uploads once publish-to-pypi is enabled). The supply-chain guarantee was meaningless.

This is the recurring pattern documented at ~/.claude/projects/-home-byron-dev--claude/memory/feedback_slsa_provenance_pattern.md: a SLSA provenance job must hash the exact files that shipped, never a fresh local rebuild.

Fix

Replace the local uv build with artifact retrieval, branching on the trigger:

workflow_run: download the release-dist GitHub Actions artifact (uploaded by python-release.yml, 5-day retention) from the upstream Semantic Release run via actions/download-artifact@v8.0.1 with run-id: github.event.workflow_run.id.
workflow_dispatch: download the wheels and sdists attached to the matching v<version> GitHub Release using gh release download (covers re-runs after GHA artifact retention has lapsed).

The provenance now attests the same bytes that landed in the release.
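
For readers who want to picture the workflow_dispatch path as first described, here is a minimal sketch. It is not the workflow's actual step: the function names `release_tag` and `fetch_release_dists` are hypothetical, and the `case` pattern is a POSIX equivalent of the `^[A-Za-z0-9._+-]+$` check. It assumes the GitHub CLI is installed and authenticated.

```shell
# Hypothetical helper: validate a version string and print the release tag.
release_tag() {
  local version
  version="${1:-}"
  case "$version" in
    # Reject empty input and anything outside [A-Za-z0-9._+-].
    ''|*[!A-Za-z0-9._+-]*)
      echo "::error::Invalid version input: must match ^[A-Za-z0-9._+-]+\$" >&2
      return 1
      ;;
  esac
  printf 'v%s\n' "$version"
}

# Hypothetical helper: pull the wheels and sdists attached to the release
# into dist/ so the provenance job hashes the published files.
fetch_release_dists() {
  local tag
  tag="$(release_tag "${1:-}")" || return 1
  mkdir -p dist
  gh release download "$tag" --pattern '*.whl' --pattern '*.tar.gz' --dir dist
}
```

Calling `fetch_release_dists 1.2.3` would download from the `v1.2.3` release. Validating before building the tag keeps unexpected characters out of the `gh` arguments.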

Safety

Added a guard that fails the job when dist/ contains neither a wheel nor an sdist so the SLSA generator never emits empty provenance.
Workflow-level permissions: {} with job-scoped contents: read and actions: read for the collect job; the slsa job keeps id-token: write, contents: write, and actions: read for the org reusable workflow.
INPUT_VERSION validated against ^[A-Za-z0-9._+-]+$ before interpolation.
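
The empty-dist guard could look roughly like the sketch below. The function name `require_dist_artifacts` and the message wording are assumptions for illustration. The PR text only states that the job fails when dist/ contains neither a wheel nor an sdist.

```shell
# Hypothetical guard: fail unless DIR holds at least one wheel or sdist.
require_dist_artifacts() {
  local dir count f
  dir="${1:-dist}"
  count=0
  for f in "$dir"/*.whl "$dir"/*.tar.gz; do
    # An unmatched glob stays literal, so confirm the path is a real file.
    if [ -f "$f" ]; then
      count=$((count + 1))
    fi
  done
  if [ "$count" -eq 0 ]; then
    echo "::error::No wheel or sdist found in $dir; refusing to generate provenance" >&2
    return 1
  fi
  echo "Found $count distribution file(s) in $dir"
}
```

Running this right after the download step means the hashing step never sees an empty directory.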

Verification

actionlint clean.
No em-dashes (PC-011 compliant).
Branch built from origin/main in a per-repo worktree at .worktrees/slsa-provenance-fix.
CI on this PR.

Refs: feedback_slsa_provenance_pattern.md

Summary by CodeRabbit

Chores
- Updated release provenance workflow to generate SLSA Level 3 provenance from published artifacts.
- Improved security and reliability of release artifact verification.

The slsa-provenance.yml workflow was running `uv build` to regenerate dist/ inside the provenance job, then hashing those fresh artifacts. Because Python builds are not bit-for-bit reproducible, the resulting SLSA attestation described files that diverged from what the upstream Semantic Release job actually published to the GitHub Release. The chain of custody was meaningless.

Replace the local build with artifact retrieval:

- workflow_run trigger: download the `release-dist` GHA artifact from the upstream Semantic Release run (run-id from github.event.workflow_run.id) using actions/download-artifact v8.0.1.
- workflow_dispatch trigger: download the wheels and sdists attached to the matching `v<version>` GitHub Release via `gh release download`.

The provenance now attests the bytes that were actually shipped, which is the entire point of SLSA L3 supply-chain attestation.

Adds a guard that fails the run when dist/ contains neither a wheel nor an sdist so the SLSA generator never emits an empty provenance.

Refs: ~/.claude/projects/-home-byron-dev--claude/memory/feedback_slsa_provenance_pattern.md

coderabbitai · 2026-05-26T02:56:34Z

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

The SLSA Provenance workflow was restructured to generate SLSA Level 3 provenance from published release artifacts instead of building and hashing in-workflow. Manual invocation now uses an optional run_id input. A new hash job validates the upstream Semantic Release run, downloads release artifacts, and computes deterministic hashes. The provenance job then calls the official SLSA3 generator inline with those hashes to produce standardized attestations.

Changes

SLSA Provenance Workflow Refactoring

Layer / File(s)	Summary
Workflow Trigger and Contract Updates `.github/workflows/slsa-provenance.yml`	Workflow metadata, triggers, and job interface updated: manual invocation contract changes from required `version` input to optional `run_id`; new hash job output interface exposes `hashes` and `tag` for downstream provenance generation.
Artifact Download and Hashing Pipeline `.github/workflows/slsa-provenance.yml`	Hash job validates upstream Semantic Release run (prefers explicit `run_id` when provided, otherwise uses triggering run), hard-fails on non-successful or non-Semantic Release runs, downloads `release-dist` artifact, errors on empty artifacts, resolves release tag via GitHub API, and computes deterministic base64-encoded SHA256 hashes for published `.whl` and `.tar.gz` files only.
Inline SLSA3 Provenance Generation `.github/workflows/slsa-provenance.yml`	Provenance job calls official `slsa-github-generator` generic SLSA3 generator inline, wiring computed hashes as subjects, enabling artifact upload, attaching resolved tag via `upload-tag-name`, setting provenance output name to `multiple.intoto.jsonl`, and using id-token permissions for OIDC-based token issuance.

Sequence Diagram

sequenceDiagram
  participant SemanticRelease as Semantic Release Workflow
  participant HashJob as Hash Job
  participant GitHubAPI as GitHub API
  participant SLSAGen as SLSA Generator
  participant Artifacts as Release Artifacts
  
  SemanticRelease->>Artifacts: publish release-dist (*.whl, *.tar.gz)
  HashJob->>GitHubAPI: resolve upstream run (prefer run_id)
  GitHubAPI-->>HashJob: run metadata & validation
  HashJob->>Artifacts: download release-dist
  Artifacts-->>HashJob: distribution files
  HashJob->>GitHubAPI: resolve release tag from SHA
  GitHubAPI-->>HashJob: tag name
  HashJob->>HashJob: compute SHA256 hashes
  HashJob->>SLSAGen: pass hashes & tag
  SLSAGen->>GitHubAPI: request id-token (OIDC)
  GitHubAPI-->>SLSAGen: id-token
  SLSAGen->>SLSAGen: generate intoto attestation
  SLSAGen->>Artifacts: upload multiple.intoto.jsonl

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

ci, security

Poem

🐰 From build to blessing, we've taken a leap,
Hashing artifacts that others do keep,
SLSA3 shines with attestations so fine,
No more building here—just artifacts divine! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: replacing local builds with attestation of pre-built release artifacts to fix SLSA provenance integrity.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

github-actions · 2026-05-26T02:56:55Z

⚠️ Deprecation Warning: The deny-licenses option is deprecated for possible removal in the next major release. For more information, see issue 997.

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

Package	Version	Score
actions/actions/download-artifact	3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c	🟢 5.3

Details

Check	Score	Reason
Packaging	⚠️ -1	packaging workflow not detected
Dangerous-Workflow	🟢 10	no dangerous workflow patterns detected
Code-Review	🟢 10	all changesets reviewed
Maintained	⚠️ 2	3 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 2
Binary-Artifacts	🟢 10	no binaries found in the repo
CII-Best-Practices	⚠️ 0	no effort to earn an OpenSSF best practices badge detected
Token-Permissions	⚠️ 0	detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies	⚠️ 0	dependency not pinned by hash detected -- score normalized to 0
Fuzzing	⚠️ 0	project is not fuzzed
License	🟢 10	license file detected
Signed-Releases	⚠️ -1	no releases found
Security-Policy	🟢 9	security policy file detected
SAST	🟢 10	SAST tool is run on all commits
Branch-Protection	⚠️ 0	branch protection not enabled on development/release branches

actions/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml	f7dd8c54c2067bafc12ca7a55595d5ee9b75204a	🟢 7

Details

Check	Score	Reason
Dependency-Update-Tool	🟢 10	update tool detected
Code-Review	🟢 10	all changesets reviewed
Security-Policy	🟢 10	security policy file detected
Maintained	🟢 5	1 commit(s) and 5 issue activity found in the last 90 days -- score normalized to 5
Binary-Artifacts	🟢 10	no binaries found in the repo
Dangerous-Workflow	🟢 10	no dangerous workflow patterns detected
Packaging	⚠️ -1	packaging workflow not detected
Token-Permissions	🟢 10	GitHub workflow tokens follow principle of least privilege
CII-Best-Practices	🟢 5	badge detected: Passing
Pinned-Dependencies	🟢 4	dependency not pinned by hash detected -- score normalized to 4
SAST	🟢 7	SAST tool detected but not run on all commits
Fuzzing	⚠️ 0	project is not fuzzed
License	🟢 10	license file detected
Signed-Releases	🟢 10	5 out of the last 5 releases have a total of 5 signed artifacts.
Branch-Protection	⚠️ 2	branch protection is not maximal on development and all release branches
CI-Tests	⚠️ 2	7 out of 28 merged PRs checked by a CI test -- score normalized to 2
Contributors	🟢 10	project has 33 contributing companies or organizations
Vulnerabilities	⚠️ 0	107 existing vulnerabilities detected

Scanned Files

.github/workflows/slsa-provenance.yml

Copilot

Pull request overview

This PR updates the SLSA provenance workflow to attest the exact pre-built release artifacts (downloaded from the upstream release workflow run or from an existing GitHub Release), instead of rebuilding packages inside the provenance job (which can produce non-reproducible bytes and invalidate the attestation).

Changes:

Replaces uv build with artifact retrieval:
- workflow_run: downloads the release-dist artifact from the triggering “Semantic Release” run.
- workflow_dispatch: downloads wheels/sdists from the matching v<version> GitHub Release via gh.
Tightens permissions to workflow-level permissions: {} with job-scoped permissions.
Adds guards/validation to prevent generating provenance when dist/ is empty and to validate the manual version input format.

williaby · 2026-05-26T03:17:51Z

+          if ! [[ "$INPUT_VERSION" =~ ^[A-Za-z0-9._+-]+$ ]]; then
+            echo "::error::Invalid version input: must match ^[A-Za-z0-9._+-]+$"
+            exit 1
+          fi
+          TAG="v$INPUT_VERSION"
+          mkdir -p dist
+          # Pull wheels and sdists attached to the release.


Resolved by williaby in 1f4e9da: the workflow_dispatch input changed from a version string to an integer-validated run_id, so the TAG="v$INPUT_VERSION" concatenation no longer exists. The tag is now resolved by querying the GitHub Releases API for the head SHA.

williaby · 2026-05-26T03:17:55Z

-        with:
-          subject-path: 'dist/*'
+          set -euo pipefail
+          HASHES=$(sha256sum ./* | base64 -w0)


Resolved by williaby in 1f4e9da: the hashing step now uses cd dist && sha256sum -- *.whl *.tar.gz | sort | base64 -w0, restoring bare-name subjects, plus adds a flag-injection guard (--), explicit globs scoped to wheel/sdist, and deterministic ordering.
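
A minimal sketch of the hashing approach described above. The function name `hash_dist_subjects` is hypothetical, and the unmatched-glob handling and `LC_ALL=C` are additions of this sketch rather than details confirmed by the PR. It assumes GNU coreutils (`sha256sum`, `base64 -w0`).

```shell
# Hypothetical helper: print base64-encoded sha256sum lines for the wheels
# and sdists in DIR, using bare file names as subject names.
hash_dist_subjects() {
  local dir sums
  dir="${1:-dist}"
  sums="$(
    # Hash from inside the directory so names carry no path prefix.
    cd "$dir" || exit 1
    set --
    for f in *.whl *.tar.gz; do
      # Skip unmatched globs, which the shell leaves as literal patterns.
      if [ -f "$f" ]; then
        set -- "$@" "$f"
      fi
    done
    [ "$#" -gt 0 ] || exit 1
    # "--" ends option parsing in case a file name starts with "-".
    sha256sum -- "$@"
  )" || return 1
  # Sort under a fixed locale so the encoded output is deterministic.
  printf '%s\n' "$sums" | LC_ALL=C sort | base64 -w0
}
```

In a workflow step the result would typically be written to `$GITHUB_OUTPUT` and passed to the generator as `base64-subjects`. Files other than `*.whl` and `*.tar.gz` are ignored, so stray files in the artifact never become attested subjects.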

The previous revision still routed through `ByronWilliamsCPA/.github/.github/workflows/python-slsa.yml`, which cannot work as a reusable workflow:

1. The template defines only `on: workflow_dispatch:`, so it has no `workflow_call:` entry-point and `uses:` cannot invoke it.
2. The template itself calls `slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml`, which is a reusable workflow. GitHub Actions forbids nested reusable workflow calls.
3. The template's own header docs say: "This is a TEMPLATE, not a reusable workflow ... You must copy the 'provenance' job directly into your release workflow."

CI on PR #54 still went green because the PR validation gates only exercise workflow file syntax; the actual `workflow_run` -> SLSA provenance flow fires only after a real Semantic Release run and therefore was never exercised on the PR.

Switch to the inline pattern used by the working exemplar `ByronWilliamsCPA/homelab-infra` PR #435:

- `hash` job downloads the upstream `release-dist` artifact via `actions/download-artifact@v8.0.1` using `github.event.workflow_run.id` (or a `run_id` dispatch input).
- Hashes are computed only over `*.whl` and `*.tar.gz` and sorted for determinism before base64 encoding.
- A tag resolver picks the tag created by Semantic Release at the head SHA, falling back to the most recent release.
- `provenance` job calls the official SLSA generator `slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@v2.1.0` directly with `base64-subjects`, `upload-assets: true`, and `upload-tag-name: ${{ needs.hash.outputs.tag }}`.

The dispatch input changes from `version` (string) to `run_id` (integer) to match the artifact-download model: provenance is now keyed off a specific Semantic Release run, not a version string.

Refs: ByronWilliamsCPA/homelab-infra#435

williaby · 2026-05-26T03:12:44Z

PR Review

Reviewed at HEAD 1f4e9daf1841ac54ac3f68a77eb7be30a0701fed (2026-05-26 03:03 UTC). All 33 CI checks pass; SonarCloud Quality Gate passed (0 new issues, 0 hotspots); GitGuardian and Socket Security clean.

The architectural pivot to inline the SLSA generator is correct: GitHub Actions forbids nested reusable-workflow calls, and the org python-slsa.yml is workflow_dispatch-only. Both async reviewers' inline comments (Copilot's vv0.1.0 and sha256sum ./* findings) were obsoleted by commit 1f4e9daf 4 minutes after they were posted; those threads can be marked resolved manually.

CodeRabbit hit the organization PR-review rate limit and did not deliver a review.

Important

Manual path trust gap. slsa-provenance.yml L66-L94 accepts any positive integer as `run_id` and downloads the `release-dist` artifact from that run without verifying the run belongs to the `Semantic Release` workflow. A user with `Actions: write` could pass a `run_id` from another workflow that uploads an artifact named `release-dist` and mint a SLSA attestation describing those substitute bytes. The automated `workflow_run` path is unaffected by this. Fix: after resolving `RUN_ID`, validate the run with `gh api repos/${GITHUB_REPOSITORY}/actions/runs/${RUN_ID}` and assert `.name == 'Semantic Release'` and `.conclusion == 'success'`.
Pre-existing egress-policy gap (informational restate). slsa-provenance.yml L64 keeps `egress-policy: audit` with the existing `TODO: switch to block after 2026-06-30`. Not introduced by this PR. Worth restating only because this PR's central thesis is supply-chain hardening; the deadline is already tracked.
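
The run validation suggested in the first finding could be sketched as below. The function names `assert_release_run` and `verify_run_id` are hypothetical, and splitting the check from the API lookup is a choice made here so the check can be exercised without network access. It assumes `gh` is authenticated and `GITHUB_REPOSITORY` is set, as it is on GitHub-hosted runners.

```shell
# Hypothetical check: succeed only for a successful "Semantic Release" run.
# Arguments: run name and run conclusion as reported by the Actions API.
assert_release_run() {
  local name conclusion
  name="${1:-}"
  conclusion="${2:-}"
  if [ "$name" != "Semantic Release" ]; then
    echo "::error::Run belongs to '$name', not 'Semantic Release'" >&2
    return 1
  fi
  if [ "$conclusion" != "success" ]; then
    echo "::error::Run concluded with '$conclusion', expected 'success'" >&2
    return 1
  fi
}

# Hypothetical wrapper: look the run up and validate it before any download.
verify_run_id() {
  local run_id name conclusion
  run_id="${1:-}"
  case "$run_id" in
    # Accept only a positive integer without leading zeros.
    ''|*[!0-9]*|0*)
      echo "::error::run_id must be a positive integer" >&2
      return 1
      ;;
  esac
  name="$(gh api "repos/${GITHUB_REPOSITORY}/actions/runs/${run_id}" --jq '.name')" || return 1
  conclusion="$(gh api "repos/${GITHUB_REPOSITORY}/actions/runs/${run_id}" --jq '.conclusion')" || return 1
  assert_release_run "$name" "$conclusion"
}
```

With this in place, a `run_id` from an unrelated workflow that happens to upload an artifact named `release-dist` is rejected before anything is hashed.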

Suggested

jq filter uses shell interpolation, not `--arg`. slsa-provenance.yml L117-L119 embeds `${HEAD_BRANCH:-main}` and `${HEAD_SHA}` directly in the `--jq` expression. Not exploitable today (workflow_run constrains both values to safe shapes), but `--arg b ... --arg s ...` is the standard hardening.
Latest-release fallback can mis-attach provenance. slsa-provenance.yml L120-L127 falls back to `gh release list --limit 1` if the SHA-based lookup fails. If a newer release publishes between the upstream Semantic Release run and this provenance run, the attestation attaches to the wrong tag silently. Low probability; consider failing explicitly on resolve-miss rather than guessing.

🤖 Generated with Claude Code

Two review-driven hardenings to slsa-provenance.yml:

1. Verify that the supplied run_id belongs to a successful Semantic Release workflow run before downloading and attesting its artifacts. The workflow_run trigger filter already restricts the automated path, but the workflow_dispatch path accepted any positive integer and would attest whatever release-dist artifact existed on that run. A user with Actions:write could mint a fraudulent SLSA attestation describing substitute bytes.
2. Pass HEAD_BRANCH and HEAD_SHA to the jq filter via --arg rather than shell interpolation into the filter string. Not exploitable today (workflow_run constrains both values), but --arg is the standard pattern and removes the entire injection class.

Refs PR #54 review (SEC-005 Important, SEC-007 Suggested).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

williaby · 2026-05-26T03:18:12Z

PR Fix Summary

Pushed e8eefc2 on top of 1f4e9da. CI is re-running automatically.

Addressed (1 Important + 1 Suggested from the /pr-review pass):

SEC-005 (Important). Resolve source run id step now calls gh api repos/${GITHUB_REPOSITORY}/actions/runs/${RUN_ID} and asserts both .name == "Semantic Release" and .conclusion == "success" before continuing. The workflow_run trigger filter already constrained the automated path; this closes the workflow_dispatch path to the same trust level.
SEC-007 (Suggested). Resolve release tag step now pipes gh api's output through external jq with --arg b ... --arg s ... instead of shell-interpolating HEAD_BRANCH and HEAD_SHA into the jq filter string. gh api --jq does not accept --arg (it embeds a jq library, not the CLI), so the rewrite uses a real jq invocation downstream of gh api.
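
To illustrate the SEC-007 change, here is a rough sketch of piping `gh api` output into a standalone `jq` with `--arg`. The function names and the filter itself are assumptions: the thread does not show the real filter, so matching on the release's `target_commitish` field is only a plausible stand-in.

```shell
# Hypothetical filter: read a JSON array of releases on stdin and print the
# tag of the first release whose target_commitish equals the head SHA or
# the head branch. Values arrive as jq variables, never as filter text.
select_release_tag() {
  local branch sha
  branch="${1:-}"
  sha="${2:-}"
  jq -r --arg b "$branch" --arg s "$sha" \
    'map(select(.target_commitish == $s or .target_commitish == $b)) | .[0].tag_name // empty'
}

# Hypothetical wrapper: fetch releases with gh, then filter with external jq.
resolve_release_tag() {
  gh api "repos/${GITHUB_REPOSITORY}/releases" | select_release_tag "${1:-main}" "${2:-}"
}
```

Because `$b` and `$s` are bound as strings, a branch name containing quotes or jq syntax is compared literally and cannot alter the filter.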

Replied to stale review threads:

Copilot's vv0.1.0 finding: marked resolved by williaby's 1f4e9da (input redesigned to integer run_id).
Copilot's sha256sum ./* finding: marked resolved by williaby's 1f4e9da (now cd dist && sha256sum -- *.whl *.tar.gz | sort).

Deliberately not addressed:

SEC-011 (Important, pre-existing). egress-policy: audit is unchanged. It pre-dates this PR and is already tracked by the # TODO: switch to block after 2026-06-30 comment.
Latest-release fallback (Suggested). gh release list --limit 1 fallback in Resolve release tag is unchanged. Removing it would be a behavior change that should be williaby's call rather than a drive-by fix.

Pre-commit passing locally (yaml-check, actionlint, em-dash check, trufflehog, github-workflows validation).

🤖 Generated with Claude Code

sonarqubecloud · 2026-05-26T03:19:07Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

Copilot AI review requested due to automatic review settings May 26, 2026 02:56

Copilot started reviewing on behalf of williaby May 26, 2026 02:56 View session

Copilot AI reviewed May 26, 2026

View reviewed changes

coderabbitai Bot added ci security labels May 26, 2026

williaby merged commit cd2cb22 into main May 26, 2026
35 of 36 checks passed

williaby deleted the fix/slsa-provenance-download-artifacts branch May 26, 2026 03:19

williaby merged 3 commits into main from fix/slsa-provenance-download-artifacts

2 participants
