Merged
Changes from 1 commit
139 changes: 91 additions & 48 deletions .github/workflows/slsa-provenance.yml
@@ -1,8 +1,21 @@
# SLSA Provenance Generation for RAG Processor
# Generates SLSA Level 3 provenance attestations for published packages.
#
# This workflow builds the package, generates hashes, and calls the org-level
# SLSA provenance workflow to create cryptographic attestations.
# This workflow downloads pre-built distribution artifacts (never rebuilds
# from source), hashes them, and calls the org-level SLSA provenance
# workflow to create cryptographic attestations.
#
# Why download instead of rebuild?
# SLSA attestation must describe the exact files that were published. If
# the provenance job rebuilt the package locally, the attested hashes
# would diverge from the artifacts attached to the GitHub Release (and
# any future PyPI upload), defeating the supply-chain guarantee.
#
# Artifact sources:
# - workflow_run trigger: downloads the `release-dist` artifact uploaded
# by the upstream "Semantic Release" run that produced this version.
# - workflow_dispatch trigger: downloads the dist files attached to the
# matching GitHub Release (`v<version>` tag) via the GitHub CLI.
#
# Reference: https://slsa.dev/
name: SLSA Provenance
@@ -13,29 +26,28 @@ on:
workflows: ["Semantic Release"]
types: [completed]
branches: [main, master]
# Manual trigger for re-generating provenance
# Manual trigger for re-generating provenance against an existing release
workflow_dispatch:
inputs:
version:
description: 'Version to generate provenance for (e.g., 0.1.0)'
description: 'Released version to generate provenance for (e.g., 0.1.0)'
required: true
type: string

permissions:
contents: write
id-token: write
actions: read
attestations: write
permissions: {}

jobs:
# ==========================================================================
# Build and Generate Hashes
# Download Pre-Built Artifacts and Generate Hashes
# ==========================================================================
build:
name: Build Package
collect:
name: Collect Release Artifacts
runs-on: ubuntu-latest
timeout-minutes: 30
timeout-minutes: 15
if: ${{ github.event_name == 'workflow_dispatch' || github.event.workflow_run.conclusion == 'success' }}
permissions:
contents: read
actions: read
outputs:
hashes: ${{ steps.hashes.outputs.hashes }}
version: ${{ steps.version.outputs.version }}
@@ -46,64 +58,95 @@ jobs:
with:
egress-policy: audit # TODO: switch to block after 2026-06-30 (SLSA L3 hermetic build isolation)

- name: Checkout repository
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
# workflow_run path: the upstream Semantic Release job uploads dist/
# as the `release-dist` artifact (5-day retention). Pull it directly
# so the hashes attest the published bytes, not a fresh local build.
- name: Download release-dist from upstream Semantic Release run
if: ${{ github.event_name == 'workflow_run' }}
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
with:
fetch-depth: 0

- name: Set up Python
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.12"
name: release-dist
path: dist/
github-token: ${{ github.token }}
repository: ${{ github.repository }}
run-id: ${{ github.event.workflow_run.id }}

- name: Install UV
uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0
with:
enable-cache: true
# workflow_dispatch path: a release was published earlier and the
# GHA artifact retention may have lapsed. Download the dist files
# that were attached to the GitHub Release for the requested tag.
- name: Download dist from GitHub Release
if: ${{ github.event_name == 'workflow_dispatch' }}
env:
GH_TOKEN: ${{ github.token }}
INPUT_VERSION: ${{ inputs.version }}
run: |
set -euo pipefail
if ! [[ "$INPUT_VERSION" =~ ^[A-Za-z0-9._+-]+$ ]]; then
echo "::error::Invalid version input: must match ^[A-Za-z0-9._+-]+$"
exit 1
fi
TAG="v$INPUT_VERSION"
mkdir -p dist
# Pull wheels and sdists attached to the release.

Resolved by williaby in 1f4e9da: the `workflow_dispatch` input changed from a version string to an integer-validated `run_id`, so the `TAG="v$INPUT_VERSION"` concatenation no longer exists. The tag is now resolved by querying the GitHub Releases API for the head SHA.
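
For readers following this thread, here is a minimal sketch of what integer validation of a `run_id` input could look like, written in the same bash style as the step under review. The function name `validate_run_id` and the error wording are hypothetical; the actual change in 1f4e9da is not shown in this diff and may differ.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not taken from 1f4e9da): succeed only when the
# argument is a positive decimal integer with no leading zero, so the
# value cannot carry shell metacharacters or option-like prefixes.
validate_run_id() {
  local run_id="${1:-}"
  if ! [[ "$run_id" =~ ^[1-9][0-9]*$ ]]; then
    # Same annotation format the workflow already uses for bad input.
    echo "::error::Invalid run_id input: must be a positive integer"
    return 1
  fi
}
```

A step could call `validate_run_id "$INPUT_RUN_ID"` before using the value, where `INPUT_RUN_ID` is an assumed environment variable name. Passing the input through `env:` rather than interpolating it into the script mirrors what the step above already does for `INPUT_VERSION`.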

gh release download "$TAG" \
--repo "$GITHUB_REPOSITORY" \
--dir dist \
--pattern '*.whl' \
--pattern '*.tar.gz'

- name: Determine version
id: version
env:
INPUT_VERSION: ${{ github.event.inputs.version }}
EVENT_NAME: ${{ github.event_name }}
INPUT_VERSION: ${{ inputs.version }}
run: |
if [ -n "$INPUT_VERSION" ]; then
set -euo pipefail
if [ "$EVENT_NAME" = "workflow_dispatch" ]; then
VERSION="$INPUT_VERSION"
else
VERSION=$(grep -Po '(?<=^version = ")[^"]*' pyproject.toml)
# Extract version from the wheel filename produced upstream.
WHEEL=$(find dist -maxdepth 1 -name '*.whl' -print -quit)
if [ -z "$WHEEL" ]; then
echo "::error::No wheel found in downloaded dist/ artifact"
find dist -maxdepth 1 -type f -printf '%p\n' || true
exit 1
fi
BASENAME=$(basename "$WHEEL")
VERSION=$(echo "$BASENAME" | awk -F'-' '{print $2}')
fi
echo "version=$VERSION" >> $GITHUB_OUTPUT
echo "version=$VERSION" >> "$GITHUB_OUTPUT"

- name: Build package
run: uv build
- name: Verify dist contents
run: |
set -euo pipefail
echo "Files attested by this provenance run:"
find dist -maxdepth 1 -type f -printf '%f %s bytes\n'
# Refuse to attest an empty directory; SLSA generator would
# otherwise emit a meaningless provenance.
WHEEL=$(find dist -maxdepth 1 -name '*.whl' -print -quit)
SDIST=$(find dist -maxdepth 1 -name '*.tar.gz' -print -quit)
if [ -z "$WHEEL" ] && [ -z "$SDIST" ]; then
echo "::error::No wheel or sdist found in dist/; nothing to attest"
exit 1
fi

- name: Generate SHA256 hashes
id: hashes
working-directory: dist
run: |
cd dist
HASHES=$(sha256sum * | base64 -w0)
echo "hashes=$HASHES" >> $GITHUB_OUTPUT

- name: Upload build artifacts
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
with:
name: dist-${{ steps.version.outputs.version }}
path: dist/
retention-days: 90

- name: Generate artifact attestation
uses: actions/attest-build-provenance@a2bbfa25375fe432b6a289bc6b6cd05ecd0c4c32 # v4.1.0
with:
subject-path: 'dist/*'
set -euo pipefail
HASHES=$(sha256sum ./* | base64 -w0)

Resolved by williaby in 1f4e9da: the hashing step now uses `cd dist && sha256sum -- *.whl *.tar.gz | sort | base64 -w0`. This restores bare-name subjects and adds a flag-injection guard (`--`), explicit globs scoped to wheel/sdist, and deterministic ordering.
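
The command quoted in this comment can be sketched as a standalone function to show why each piece matters. This is an illustration under stated assumptions, not the code from 1f4e9da: the function name `hash_dist`, the `nullglob` guard, and the `LC_ALL=C` setting are additions made here for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print base64-encoded sha256sum lines for the
# wheels and sdists in the directory given as $1.
hash_dist() {
  local dir="$1"
  (
    # Hash from inside the directory so subjects are bare file names
    # ("pkg.whl"); hashing "./*" would record "./pkg.whl" instead.
    cd "$dir" || exit 1
    # Assumption for this sketch: unmatched globs expand to nothing.
    shopt -s nullglob
    files=(*.whl *.tar.gz)
    if [ "${#files[@]}" -eq 0 ]; then
      echo "::error::No wheel or sdist found in $dir; nothing to attest"
      exit 1
    fi
    # "--" ends option parsing, so a file named like "-c" is hashed
    # rather than read as a flag. LC_ALL=C keeps the sort order
    # independent of the runner's locale (also an assumption here).
    sha256sum -- "${files[@]}" | LC_ALL=C sort | base64 -w0
  )
}
```

In a workflow step this might be used as `HASHES=$(hash_dist dist)` followed by `echo "hashes=$HASHES" >> "$GITHUB_OUTPUT"`. Note that `sort` orders the lines by hash, since the digest comes first on each line; that is still deterministic for a given set of files.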

echo "hashes=$HASHES" >> "$GITHUB_OUTPUT"

# ==========================================================================
# SLSA Level 3 Provenance (Org-Level Reusable Workflow)
# ==========================================================================
slsa:
name: SLSA Level 3
needs: [build]
needs: [collect]
uses: ByronWilliamsCPA/.github/.github/workflows/python-slsa.yml@961eb17d8e9b7fe0d8bfc5dbe9d23c824484fb11 # main
with:
base64-subjects: ${{ needs.build.outputs.hashes }}
base64-subjects: ${{ needs.collect.outputs.hashes }}
upload-assets: true
permissions:
id-token: write