Skip to content
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,16 @@ are no numbered releases.
2026) makes the SBOM scanner itself a supply-chain risk; Grype (Anchore) is
unaffected.

- `scripts/fleet-audit-sha-pins.sh`: new `STRICT_AUDIT=1` environment toggle.
When set, the script exits with status 2 if any repo emitted a `repo,error`
row or if any org's repo count saturated `REPO_LIMIT`. CI gates that need
fail-closed semantics opt in via `STRICT_AUDIT=1`; the default flow remains
report-only (exit 0) so existing consumers are unaffected. Companion
change: `REPO_LIMIT` now reads from environment with a default of 1000
(`REPO_LIMIT="${REPO_LIMIT:-1000}"`), letting tests exercise the
saturation WARN with a small stub-friendly value and giving operators an
escape hatch for fleets that grow beyond the default.

### Breaking Changes

- `python-security-analysis.yml`: the `safety` SCA scanner has been
Expand Down Expand Up @@ -178,6 +188,8 @@ are no numbered releases.
- `scripts/fleet-audit-sha-pins.sh`: distinguish legitimate HTTP 404 (no workflows directory) from real API failures (rate limit, 5xx, auth loss); real failures now emit a `repo,error` sentinel row and a stderr WARN line instead of silently appearing as `repo,0`. The 404 discriminator now requires the exact `(HTTP 404)` parenthesized form so an upstream 5xx response containing the substring "Not Found" cannot be misclassified as a missing resource. Replaces the regex-based `SKIP_OWNERS` check with a literal `case` match so operator-supplied values containing regex metacharacters (e.g., `.*`) cannot silently classify every owner as skipped. Adds a stderr WARN when a per-file 404 occurs mid-iteration (the directory listing succeeded but the specific file fetch returned 404, indicating a race or scope mismatch). Replaces the silent `|| true` masking on the base64 decode with explicit rc handling that propagates decode failures to the `repo,error` sentinel. Addresses the silent-failure and regex-injection findings from the PR #175 follow-up review.
- `scripts/update-pinned-actions.sh`: `extract_tag_pins` and `extract_branch_pins` now accept refs that carry an inline `# comment` (e.g., `actions/checkout@v4 # stable`); the converter previously required the ref to be the last non-whitespace content on the line, leaving inline-annotated refs as lingering audit violations. The sed substitution in `--pin-tags --apply` now spans to end-of-line so any pre-existing inline comment is replaced cleanly by the new `# <tag>` comment. Escapes sed-replacement metacharacters (`\`, `&`, `|`) in the resolved tag name AND sed-pattern metacharacters in `current_tag` and `full_action` before splicing into the substitution; a semver `.` in `current_tag` no longer matches arbitrary characters and tags containing the sed delimiter `|` no longer terminate the pattern field. Replaces the regex-based `--owner-allowlist` filter in both extractors with an awk literal-string lookup, eliminating the regex-injection vector where `--owner-allowlist '.*'` would silently classify every action as first-party. Replaces the `{ pipeline; } || true` wrappers in both extractors with explicit grep rc handling so a real failure (rc>=2: unreadable directory, permission denied) surfaces as a stderr WARN and a non-zero return instead of a silent empty result. Replaces the two-call `resolve_tag_sha` pattern with a single composite-jq fetch, removing the race window where a force-push or rate-limit hit between two fetches of the same ref could return disagreeing values. Makes the EXIT trap BSD-compatible by using `${VAR:+"$VAR"}` so `rm -f` never receives an empty-string argument on macOS. Hoists the `CHANGE_LOG` temp file and its cleanup trap to script scope so `pin_tags_main` no longer relies on a function-local `RETURN` trap; both extractor pipelines tolerate zero matches under `set -euo pipefail` via the new rc-aware path.
- `tests/update-pinned-actions.bats`, `tests/fleet-audit-sha-pins.bats`: add eight new test cases covering the hardening above. The five from the original PR #175 follow-up commit (inline-comment conversion, custom `--owner-allowlist`, two-step annotated-tag dereferencing, audit emits `repo,error` on non-404 failure, audit treats 404 on the workflows directory as a legitimate zero) are joined by three regression-vector tests: sed-escape of `&` and `|` characters in tag-name comments, `--skip-owners` with regex metacharacters still counts violations, and mixed-success inner-loop (one workflow file fetches OK, a second fails, repo row flips to `error` not a partial numeric count).
- `scripts/update-pinned-actions.sh`: `resolve_tag_sha` now validates that the `gh api` composite jq response contains non-empty, non-`"null"` values for `.object.type` and `.object.sha`. The previous code accepted the literal string `"null"` (emitted by `jq -r` for missing fields) as a successful resolution, which produced patterns like `actions/checkout@null` that the caller wrote into the workflow file. The same guard now also fires after the second annotated-tag dereferencing call. Action refs that fail validation are marked SKIPPED rather than written with a malformed SHA.
- `tests/fleet-audit-sha-pins.bats`, `tests/update-pinned-actions.bats`: add nine more regression-vector tests guarding the Suggested-tier fixes. fleet-audit gains: REPO_LIMIT saturation WARN with env-overridable REPO_LIMIT, STRICT_AUDIT=1 fail-closed on saturation, STRICT_AUDIT=1 fail-closed on per-repo error, STRICT_AUDIT=1 clean-run regression guard, SKIP_OWNERS edge cases (single value, empty, double internal commas). update-pinned-actions gains: extract_branch_pins reports comment-annotated branch refs, --pin-tags --apply succeeds with only-first-party fixtures, --pin-tags --apply skips when annotated-tag dereferencing returns missing `.object.sha`.

### Security

Expand Down
30 changes: 29 additions & 1 deletion scripts/fleet-audit-sha-pins.sh
Original file line number Diff line number Diff line change
Expand Up @@ -14,9 +14,22 @@
# Exit code: 0 even when violations or errors are present (this is a report,
# not a gate). Errors are surfaced via the sentinel row plus stderr.
#
# Environment overrides:
# REPO_LIMIT Max repos fetched per org via `gh repo list` (default 1000).
# Lower it for testing; raise it if your fleet exceeds 1000
# repos. The script emits a stderr WARN when the result count
# equals REPO_LIMIT (saturation, audit may be truncated).
# STRICT_AUDIT When set to "1", exit non-zero at the end of the run if any
# API failure produced a `repo,error` row OR any org's repo
# count saturated REPO_LIMIT. Use this in CI gates that
# should fail closed when the audit is incomplete. The
# default (unset / "0") preserves report-only semantics.
#
# Usage:
# ./scripts/fleet-audit-sha-pins.sh --org ByronWilliamsCPA [--org williaby] \
# [--skip-owners ByronWilliamsCPA,williaby] [--output audit.csv]
# STRICT_AUDIT=1 ./scripts/fleet-audit-sha-pins.sh --org ByronWilliamsCPA
# REPO_LIMIT=3 ./scripts/fleet-audit-sha-pins.sh --org ByronWilliamsCPA
# =============================================================================

set -euo pipefail
Expand Down Expand Up @@ -96,12 +109,19 @@ is_skipped_owner() {
esac
}

REPO_LIMIT=1000
# REPO_LIMIT defaults to 1000 but may be overridden via environment for tests
# and for fleets that grow beyond the default.
REPO_LIMIT="${REPO_LIMIT:-1000}"

# Tracks whether the audit was incomplete due to API errors or saturation;
# STRICT_AUDIT consults this at exit to decide whether to fail closed.
audit_incomplete=false

for org in "${ORGS[@]}"; do
mapfile -t repos < <(gh repo list "$org" --limit "$REPO_LIMIT" --json name --jq '.[].name')
if [[ ${#repos[@]} -eq $REPO_LIMIT ]]; then
echo "WARN: $org returned exactly $REPO_LIMIT repos; audit may be truncated. Raise REPO_LIMIT or page through results." >&2
audit_incomplete=true
fi
for repo in "${repos[@]}"; do
full="$org/$repo"
Expand Down Expand Up @@ -167,8 +187,16 @@ for org in "${ORGS[@]}"; do

if [[ "$api_error" == true ]]; then
emit "$full,error"
audit_incomplete=true
else
emit "$full,$violations"
Comment on lines 225 to 229
fi
done
done

# Fail closed when running under STRICT_AUDIT=1. The default flow remains
# report-only (exit 0) so existing CI consumers do not break.
if [[ "${STRICT_AUDIT:-0}" == "1" && "$audit_incomplete" == true ]]; then
echo "ERROR: STRICT_AUDIT=1 and the audit was incomplete (API errors and/or REPO_LIMIT saturation). Exiting non-zero." >&2
exit 2
fi
21 changes: 20 additions & 1 deletion scripts/update-pinned-actions.sh
Original file line number Diff line number Diff line change
Expand Up @@ -131,9 +131,22 @@ resolve_tag_sha() {
obj_type="${pair%%|*}"
obj_sha="${pair#*|}"

# jq emits the literal string "null" under -r when a referenced field
# is missing or null. Treat that as a resolution failure so the caller
# marks the action SKIPPED rather than writing "actions/checkout@null"
# into the workflow file.
if [[ -z "$obj_type" || -z "$obj_sha" || "$obj_type" == "null" || "$obj_sha" == "null" ]]; then
echo ""
return
fi

if [[ "$obj_type" == "tag" ]]; then
# Annotated tag: resolve tag object to its target commit SHA
obj_sha=$(gh api "repos/$repo/git/tags/$obj_sha" --jq '.object.sha' 2>/dev/null) || { echo ""; return; }
if [[ -z "$obj_sha" || "$obj_sha" == "null" ]]; then
echo ""
return
fi
fi

echo "$obj_sha"
Expand Down Expand Up @@ -283,7 +296,13 @@ pin_tags_main() {
while IFS='|' read -r full_action current_tag new_sha latest_tag; do
local _pat _rep safe_tag safe_action safe_current_tag
# Escape sed-replacement metacharacters in latest_tag (the replacement
# side). Order matters: backslash first.
# side, not the pattern). Replacement strings have three specials:
# \ literal backslash (must escape first to avoid double-escape)
# & the full matched pattern (would splice the ref back in)
# | only special because the script uses `s|...|...|g`; if the
# delimiter changes to `/` (or any other char), this set must
# be updated to match. The matching pattern-side escape lives
# in escape_sed_pat() below.
safe_tag="${latest_tag//\\/\\\\}"
safe_tag="${safe_tag//&/\\&}"
safe_tag="${safe_tag//|/\\|}"
Expand Down
118 changes: 118 additions & 0 deletions tests/fleet-audit-sha-pins.bats
Original file line number Diff line number Diff line change
Expand Up @@ -248,3 +248,121 @@ STUB
assert_output --partial "ByronWilliamsCPA/repo-partial-fail,error"
refute_output --partial "ByronWilliamsCPA/repo-partial-fail,1"
}

@test "audit warns when gh repo list saturates REPO_LIMIT" {
# REPO_LIMIT is env-overridable. Stub returns exactly 3 repos and the test
# sets REPO_LIMIT=3; the script must emit a stderr WARN that this run may
# be truncated. Without the env override, exercising this WARN required
# arranging a 1000-repo stub.
cat > "$TEST_TMPDIR/bin/gh" <<'STUB'
#!/usr/bin/env bash
if [[ "$1 $2" == "auth status" ]]; then exit 0; fi
if [[ "$1" == "repo" && "$2" == "list" ]]; then
_apply_jq_flag '[{"name":"a"},{"name":"b"},{"name":"c"}]' "$@"
exit 0
fi
if [[ "$1" == "api" ]]; then
# 404 on contents/.github/workflows so each repo records 0 cleanly
echo "gh: Not Found (HTTP 404)" >&2
exit 1
fi
exit 0
STUB
chmod +x "$TEST_TMPDIR/bin/gh"

REPO_LIMIT=3 run "$SCRIPT" --org ByronWilliamsCPA --output "$TEST_TMPDIR/out.csv"
assert_success
assert_output --partial "WARN: ByronWilliamsCPA returned exactly 3 repos"
}

@test "audit STRICT_AUDIT=1 exits non-zero on REPO_LIMIT saturation" {
cat > "$TEST_TMPDIR/bin/gh" <<'STUB'
#!/usr/bin/env bash
if [[ "$1 $2" == "auth status" ]]; then exit 0; fi
if [[ "$1" == "repo" && "$2" == "list" ]]; then
_apply_jq_flag '[{"name":"a"},{"name":"b"}]' "$@"
exit 0
fi
if [[ "$1" == "api" ]]; then
echo "gh: Not Found (HTTP 404)" >&2
exit 1
fi
exit 0
STUB
chmod +x "$TEST_TMPDIR/bin/gh"

REPO_LIMIT=2 STRICT_AUDIT=1 run "$SCRIPT" --org ByronWilliamsCPA --output "$TEST_TMPDIR/out.csv"
# Script exits 2 to distinguish "audit incomplete" from a generic error
[[ "$status" -eq 2 ]] || { echo "expected exit 2 under STRICT_AUDIT, got $status"; return 1; }
assert_output --partial "ERROR: STRICT_AUDIT=1 and the audit was incomplete"
}

@test "audit STRICT_AUDIT=1 exits non-zero when any repo emits error sentinel" {
cat > "$TEST_TMPDIR/bin/gh" <<'STUB'
#!/usr/bin/env bash
if [[ "$1 $2" == "auth status" ]]; then exit 0; fi
if [[ "$1" == "repo" && "$2" == "list" ]]; then
_apply_jq_flag '[{"name":"repo-down"}]' "$@"
exit 0
fi
if [[ "$1" == "api" ]]; then
echo "gh: API rate limit exceeded for user (HTTP 429)" >&2
exit 1
fi
exit 0
STUB
chmod +x "$TEST_TMPDIR/bin/gh"

STRICT_AUDIT=1 run "$SCRIPT" --org ByronWilliamsCPA --output "$TEST_TMPDIR/out.csv"
[[ "$status" -eq 2 ]] || { echo "expected exit 2 under STRICT_AUDIT, got $status"; return 1; }
assert_output --partial "ERROR: STRICT_AUDIT=1 and the audit was incomplete"
run cat "$TEST_TMPDIR/out.csv"
assert_output --partial "ByronWilliamsCPA/repo-down,error"
}

@test "audit STRICT_AUDIT=1 remains exit 0 on a fully clean run (no errors, no saturation)" {
# Regression guard: a clean run under STRICT_AUDIT must still succeed.
# Otherwise CI gates would fire on every healthy audit.
_write_gh_stub
STRICT_AUDIT=1 run "$SCRIPT" --org ByronWilliamsCPA --output "$TEST_TMPDIR/out.csv"
assert_success
run cat "$TEST_TMPDIR/out.csv"
assert_output --partial "ByronWilliamsCPA/repo-dirty,2"
assert_output --partial "ByronWilliamsCPA/repo-clean,0"
}

@test "audit --skip-owners with a single owner (no commas) skips correctly" {
# SKIP_OWNERS framing is ",$VAR," so a single value becomes ",actions,"
# and matches an owner literally named "actions". Verifies the case
# statement handles the no-comma input.
_write_gh_stub
run "$SCRIPT" --org ByronWilliamsCPA --skip-owners 'actions' --output "$TEST_TMPDIR/out.csv"
assert_success
run cat "$TEST_TMPDIR/out.csv"
# repo-dirty has two third-party actions/* refs; both should now be skipped
assert_output --partial "ByronWilliamsCPA/repo-dirty,0"
}

@test "audit --skip-owners with empty string does not skip anything" {
# Empty SKIP_OWNERS becomes the case pattern ",,". No real owner string
# matches that, so both violations in repo-dirty must still be counted.
_write_gh_stub
run "$SCRIPT" --org ByronWilliamsCPA --skip-owners '' --output "$TEST_TMPDIR/out.csv"
assert_success
run cat "$TEST_TMPDIR/out.csv"
assert_output --partial "ByronWilliamsCPA/repo-dirty,2"
}

@test "audit --skip-owners with double internal commas still matches listed owners" {
# SKIP_OWNERS=",,actions,," builds the case framing ",,,actions,,," which
# contains noisy empty segments. Owner "actions" must still match the
# ",actions," substring without confusion from the adjacent empties.
# Both refs in repo-dirty (actions/checkout, actions/setup-python) share
# owner "actions" and should both be skipped.
_write_gh_stub
run "$SCRIPT" --org ByronWilliamsCPA --skip-owners ',,actions,,' \
--output "$TEST_TMPDIR/out.csv"
assert_success
run cat "$TEST_TMPDIR/out.csv"
assert_output --partial "ByronWilliamsCPA/repo-dirty,0"
}
104 changes: 104 additions & 0 deletions tests/update-pinned-actions.bats
Original file line number Diff line number Diff line change
Expand Up @@ -839,3 +839,107 @@ EOF
"$pin_tags_dir/tag-pinned.yml"
assert_success
}

@test "--pin-tags reports branch refs that carry an inline comment" {
# extract_branch_pins was updated to accept "# comment" suffixes; the
# existing branch-comment fixture (tag-pinned.yml's some-org/some-action@main)
# has no comment, so this path is unverified. Add a fixture with a
# comment-annotated branch ref and assert it surfaces in the WARN list.
_write_gh_stub_pin_tags
pin_tags_dir="$TEST_TMPDIR/pin_tags_workdir"
mkdir -p "$pin_tags_dir"
cat > "$pin_tags_dir/branch-with-comment.yml" <<'YML'
name: Fixture
on: push
jobs:
example:
runs-on: ubuntu-latest
steps:
- uses: some-org/branch-action@main # tracking head
- uses: actions/checkout@v4
YML

run "$SCRIPT" --pin-tags --workflows-dir "$pin_tags_dir"
assert_success
# The branch ref must appear in the WARN block even though it carries
# an inline "# tracking head" comment that the old regex would have
# treated as the line terminator.
assert_output --partial "some-org/branch-action@main"
}

@test "--pin-tags --apply on a workflow with no third-party refs succeeds without warnings" {
# Regression guard for the rc-aware extract_tag_pins/extract_branch_pins
# path: a fixture containing ONLY first-party refs (skipped by the
# owner allowlist) means grep matches lines but the awk filter emits
# nothing. The function must return 0 cleanly and the script must exit 0.
_write_gh_stub_pin_tags
pin_tags_dir="$TEST_TMPDIR/pin_tags_workdir"
mkdir -p "$pin_tags_dir"
cat > "$pin_tags_dir/all-first-party.yml" <<'YML'
name: Fixture
on: push
jobs:
example:
runs-on: ubuntu-latest
steps:
- uses: ByronWilliamsCPA/.github/.github/workflows/python-ci.yml@v1
- uses: williaby/.github/.github/workflows/release-tag.yml@v1
YML

run "$SCRIPT" --pin-tags --apply --workflows-dir "$pin_tags_dir"
assert_success
assert_output --partial "No third-party tag-pinned actions found"
# First-party refs must be untouched
run grep -F "ByronWilliamsCPA/.github/.github/workflows/python-ci.yml@v1" \
"$pin_tags_dir/all-first-party.yml"
assert_success
}

@test "--pin-tags --apply skips when annotated-tag dereferencing returns missing .object.sha" {
# Stub the two-step dereferencing path so the second call returns an
# object without `.sha`. jq emits "null" under -r; resolve_tag_sha must
# detect that and return empty, causing the action to be marked SKIPPED
# rather than writing "actions/checkout@null" into the workflow file.
local annotated_sha="bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb" # pragma: allowlist secret
cat > "$GH_BIN/gh" <<EOF
#!/usr/bin/env bash
set -e
case "\$*" in
*"release list --repo actions/checkout"*)
_apply_jq_flag '[{"tagName":"v4.3.0"}]' "\$@" ;;
*"release list --repo actions/setup-python"*)
_apply_jq_flag '[{"tagName":"v5.2.0"}]' "\$@" ;;
*"actions/checkout"*"git/refs/tags/v4.3.0"*)
_apply_jq_flag '{"object":{"type":"tag","sha":"${annotated_sha}"}}' "\$@" ;;
*"actions/checkout"*"git/tags/${annotated_sha}"*)
# Malformed response: object present but no .sha field
_apply_jq_flag '{"object":{}}' "\$@" ;;
*"actions/setup-python"*"git/refs/tags/v5.2.0"*)
_apply_jq_flag '{"object":{"type":"commit","sha":"$SETUP_PYTHON_V5_LATEST_SHA"}}' "\$@" ;;
*"auth status"*) exit 0 ;;
*) echo "unexpected gh call: \$*" >&2; exit 1 ;;
esac
EOF
chmod +x "$GH_BIN/gh"

pin_tags_dir="$TEST_TMPDIR/pin_tags_workdir"
mkdir -p "$pin_tags_dir"
cp "$FIXTURES/tag-pinned.yml" "$pin_tags_dir/"

run "$SCRIPT" --pin-tags --apply --workflows-dir "$pin_tags_dir"
assert_success
assert_output --partial "SKIP: cannot resolve SHA"

# The original v4 ref must remain untouched: the dereferencing failed,
# so no substitution should have happened for actions/checkout.
run grep -F "actions/checkout@v4" "$pin_tags_dir/tag-pinned.yml"
assert_success
# The literal string "null" must NOT have leaked into the workflow file
run grep -F "actions/checkout@null" "$pin_tags_dir/tag-pinned.yml"
assert_failure
# actions/setup-python had a clean lightweight-tag path; it should be
# converted normally
run grep -F "actions/setup-python@$SETUP_PYTHON_V5_LATEST_SHA" \
"$pin_tags_dir/tag-pinned.yml"
assert_success
}