Skip to content

feat: add preserve_anthropic_api_key option for API-key auth (#333)#335

Open
rsolmano wants to merge 4 commits intoumputun:masterfrom
rsolmano:preserve-anthropic-api-key
Open

feat: add preserve_anthropic_api_key option for API-key auth (#333)#335
rsolmano wants to merge 4 commits intoumputun:masterfrom
rsolmano:preserve-anthropic-api-key

Conversation

@rsolmano
Copy link
Copy Markdown

@rsolmano rsolmano commented May 6, 2026

Fixes #333.

Summary

ralphex unconditionally stripped ANTHROPIC_API_KEY from the child claude process environment. This broke users who authenticate Claude Code via ANTHROPIC_API_KEY rather than OAuth/keychain — claude started unauthenticated, printed Not logged in, and ralphex exited matching that string against claude_error_patterns.

The strip was deliberate: an OAuth-authenticated user with ANTHROPIC_API_KEY exported for the Anthropic SDK would otherwise have claude silently switch to the API key and bill a different account. Removing the strip unconditionally is therefore not acceptable — this PR makes the strip behavior opt-out instead.

New preserve_anthropic_api_key config option and --preserve-anthropic-api-key CLI flag, default false (existing OAuth-protective behavior preserved). When enabled, ANTHROPIC_API_KEY is passed through to the child. CLAUDECODE continues to be stripped unconditionally (prevents nested-session errors).

Changes

  • pkg/executor/executor.go — new claudeChildEnv(env, preserveAPIKey) helper; preserveAPIKey field on execClaudeRunner; PreserveAnthropicAPIKey field on ClaudeExecutor.
  • pkg/config/{config,values}.go — config field with INI parsing. The *Set merge sentinel lives only on Values (load-bearing for local-overrides-global merge); Config carries the resolved bool only, matching the move_plan_on_completion precedent.
  • pkg/config/defaults/config — commented preserve_anthropic_api_key template entry with rationale.
  • cmd/ralphex/main.go — new --preserve-anthropic-api-key flag, override in applyCLIOverrides.
  • pkg/processor/runner.go — propagation to both task and review ClaudeExecutor instances.
  • README.md, CLAUDE.md, llms.txt — docs.

Tests (TDD)

All new tests written before implementation:

  • TestClaudeChildEnv (5 subtests) — covers default/preserve behavior, partial-key-match safety, missing-key edge case.
  • TestLoad_PreserveAnthropicAPIKey (3 subtests) + TestLoad_PreserveAnthropicAPIKey_InvalidValue — INI parsing.
  • TestPreserveAnthropicAPIKeyFlag (3 subtests) — CLI flag override semantics.

Test plan

  • make test — all green, 87.0% total coverage
  • make lint — 0 issues
  • make fmt — clean
  • go test -race on changed packages — clean
  • End-to-end toy-project run (./scripts/internal/prep-toy-test.sh) — recommended before merge by maintainer

…#333)

ralphex unconditionally stripped ANTHROPIC_API_KEY from the child claude
process environment. This broke users who authenticate Claude Code via
ANTHROPIC_API_KEY rather than OAuth/keychain — claude started without
credentials, printed "Not logged in", and ralphex bailed out matching
that string against claude_error_patterns.

The strip was deliberate: an OAuth-authenticated user with ANTHROPIC_API_KEY
exported for the SDK would otherwise have claude silently switch to the API
key and bill a different account. Removing the strip unconditionally is not
acceptable, so this change makes it opt-in.

Add `preserve_anthropic_api_key` config option and `--preserve-anthropic-api-key`
CLI flag, default false (existing OAuth-protective behavior). When enabled,
ANTHROPIC_API_KEY is passed through to the claude child. CLAUDECODE continues
to be stripped unconditionally (prevents nested-session errors).

Implementation:
- new claudeChildEnv() helper in pkg/executor/executor.go
- preserveAPIKey field on execClaudeRunner, PreserveAnthropicAPIKey on ClaudeExecutor
- config field with INI parsing, merge sentinel only on Values (Config carries the
  resolved bool, matching the move_plan_on_completion pattern)
- CLI flag + applyCLIOverrides plumbing
- propagation to both task and review ClaudeExecutor instances in runner.go

Tests: TestClaudeChildEnv (5 subtests), TestLoad_PreserveAnthropicAPIKey (3
subtests + invalid-value case), TestPreserveAnthropicAPIKeyFlag (3 subtests).

Fixes umputun#333
@rsolmano rsolmano requested a review from umputun as a code owner May 6, 2026 20:30
Copy link
Copy Markdown
Owner

@umputun umputun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple things, mostly the first two matter:

  1. CLI override is one-way and that's a footgun for billing. cmd/ralphex/main.go:1272-1274 does if o.PreserveAnthropicAPIKey { cfg.PreserveAnthropicAPIKey = true }. With preserve_anthropic_api_key = true in global/local config, there's no per-run CLI way back to the safer strip behavior. For finalize that pattern is just an annoyance; for ANTHROPIC_API_KEY it's billing. A user who turned this on once for an API-key project and later runs ralphex inside an OAuth context bills the wrong account silently. The siblings just below (Wait, SessionTimeout, IdleTimeout) all track *Set to handle explicit zero/false; this one should too: track o.preserveAnthropicAPIKeySet and apply explicit false as override.

  2. Show the active auth mode in the startup banner when the flag is on. displayMeta (cmd/ralphex/main.go:497) prints plan, branch, progress log. Nothing tells the user that ANTHROPIC_API_KEY is being passed through. Adding a line like auth: API key passthrough (only when preserve_anthropic_api_key is true; nothing when false to avoid clutter) gives the user a chance to spot a wrong-context run before claude bills the wrong account. This matters more than #1 because it's visible per-run.

  3. No test for the local-overrides-global merge path. TestLoad_PreserveAnthropicAPIKey only covers single-config parsing. The whole reason PreserveAnthropicAPIKeySet exists is the merge block in mergeExtraFrom, but nothing in this PR exercises global-true + local-false → false. If a future refactor drops the sentinel handling, no test catches it. TestValues_mergeFrom_TaskModel (values_test.go:2310+) is the pattern.

  4. nit: subtest names in TestPreserveAnthropicAPIKeyFlag use snake_case (flag_enables_when_config_disabled); rest of main_test.go uses space-separated lowercase phrases.

  5. nit: ClaudeExecutor.PreserveAnthropicAPIKey reads long. The type already implies Claude and the runner field is preserveAPIKey. PreserveAPIKey would match siblings (Model, Effort, Args).

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an opt-in configuration/CLI switch to preserve ANTHROPIC_API_KEY in the child claude process environment, restoring API-key-based Claude Code authentication while keeping the existing safer default (strip the key) for OAuth/keychain users.

Changes:

  • Introduces preserve_anthropic_api_key config + --preserve-anthropic-api-key CLI flag (default false).
  • Refactors claude child env construction into a helper and plumbs the option through runner → executor.
  • Adds targeted unit tests for env filtering, config parsing, and CLI override semantics; updates docs/templates.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
README.md Documents the new CLI flag and config option in usage/examples and option tables.
pkg/processor/runner.go Propagates PreserveAnthropicAPIKey into task and review ClaudeExecutor instances.
pkg/executor/executor.go Adds claudeChildEnv(..., preserve) helper and executor/runner plumbing to control env stripping.
pkg/executor/executor_test.go Adds TestClaudeChildEnv coverage for preserve/default behavior and edge cases.
pkg/config/values.go Adds INI parsing + merge sentinel (PreserveAnthropicAPIKeySet) on Values.
pkg/config/defaults/config Adds commented template entry explaining when/why to enable preserving the API key.
pkg/config/config.go Adds resolved PreserveAnthropicAPIKey field to runtime config and wires it from loaded values.
pkg/config/config_test.go Adds config load tests for preserve option (valid/invalid values).
cmd/ralphex/main.go Adds --preserve-anthropic-api-key flag and applies it as a CLI override.
cmd/ralphex/main_test.go Adds tests verifying the CLI flag override semantics.
llms.txt Updates user-facing usage notes re: preserving ANTHROPIC_API_KEY.
CLAUDE.md Updates contributor documentation describing the new option and plumbing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread CLAUDE.md Outdated
- `session_timeout` config option: per-session timeout for claude (e.g., "30m", "1h"). Kills hanging sessions and continues to next iteration. CLI flag `--session-timeout` takes precedence. Disabled by default
- `idle_timeout` config option: kills claude sessions when no output for specified duration (e.g., "5m"). Resets on each output line, only fires when session goes silent. CLI flag `--idle-timeout` takes precedence. Disabled by default
- `move_plan_on_completion` config option: controls whether completed plans move to `docs/plans/completed/` on success. Default `true`. Disable for workflows that manage plan lifecycle externally (spec-driven tooling with separate archive steps)
- `preserve_anthropic_api_key` config option / `--preserve-anthropic-api-key` CLI flag: when true, `ANTHROPIC_API_KEY` is passed through to the child claude process. Required for users who authenticate Claude Code via API key rather than OAuth/keychain. Default `false` strips the key so a host-set value cannot silently override OAuth credentials and bill a different account. The merge sentinel `PreserveAnthropicAPIKeySet` lives only on `Values` (load-bearing for local-overrides-global merge); `Config` carries the resolved bool only. Plumbed through `Config.PreserveAnthropicAPIKey` → `runner.go:130/157` → `ClaudeExecutor.PreserveAnthropicAPIKey` → `execClaudeRunner.preserveAPIKey` → `claudeChildEnv()` in `pkg/executor/executor.go`. CLAUDECODE is always stripped regardless of this flag (prevents nested-session errors)
rsolmano and others added 2 commits May 7, 2026 09:23
- Add --no-preserve-anthropic-api-key companion flag for per-run safety
  override (go-flags rejects --bool=value, so a paired flag is the
  idiomatic way to express explicit-false override). On conflict the
  no-form wins on safety bias to avoid the billing footgun where a
  config-set true sticks across an OAuth-context run.
- Surface ANTHROPIC_API_KEY passthrough in the startup banner (warn
  color) only when enabled, so users notice wrong-context runs before
  claude bills.
- Add Values-layer merge tests covering global+local interactions and
  the *Set sentinel behavior (the local-explicit-false-overrides-global
  case is the safety reason the sentinel exists; nothing tested it).
- Rename ClaudeExecutor.PreserveAnthropicAPIKey to PreserveAPIKey to
  match sibling field style (Model, Effort, Args). User-facing names
  (config key, CLI flag, Config field) keep "anthropic" for clarity.
- Clean up subtest names to space-separated phrases.
- Drop hard-coded line-number references in CLAUDE.md plumbing notes
  (paths only, since line numbers drift).

Refs umputun#335
@rsolmano
Copy link
Copy Markdown
Author

rsolmano commented May 7, 2026

Thanks for the review. Pushed ff5014a addressing all five points.

On #1 (two-way override). Quick caveat on the approach: jessevdk/go-flags rejects --bool=value syntax, so tracking *Set alone wouldn't surface explicit-false on the CLI — the parser errors with bool flag cannot have an argument. To preserve the intent (per-run override back to safe-strip), I added a paired --no-preserve-anthropic-api-key flag, which matches the project's existing --skip-finalize convention for one-way negative flags. On conflict (both forms passed) the no-form wins on safety bias. Tests cover all four corners.

If you'd prefer a different mechanism (e.g., custom unmarshaler on a tristate type, or the same companion flag named differently like --strip-anthropic-api-key), happy to reshape.

#2 — banner line printed only when passthrough is on, in Warn color so it stands out. Tested with stdout capture.

#3 — added TestValues_mergeFrom_PreserveAnthropicAPIKey with 5 subtests, including the explicit-false-overrides-true safety case and an end-to-end Load() sanity check that catches future drops of the *Set handling.

#5 — renamed only the ClaudeExecutor field to PreserveAPIKey. User-facing names (config key, CLI flag, Config.PreserveAnthropicAPIKey) keep "anthropic" so they self-document.

#4 + Copilot nit — applied.

Copy link
Copy Markdown
Owner

@umputun umputun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scratch round-1 #1. The banner from #2 is enough on its own. If passthrough is on, the user sees auth: ANTHROPIC_API_KEY passthrough enabled every run and has a chance to bail before claude switches off the Claude Code subscription onto per-token API charges. A second --no- flag on top of that is more confusion than safety. Pls revert to the single --preserve-anthropic-api-key flag (your original shape) and drop --no-preserve-anthropic-api-key, the matching applyCLIOverrides branch, and the related tests. Keep the merge sentinel on Values though, the local-overrides-global INI merge still needs it.

One real thing left: plan mode hides the passthrough banner. runPlanMode at cmd/ralphex/main.go:998 builds startupInfo without PreserveAnthropicAPIKey, so ralphex --plan "..." with preserve_anthropic_api_key=true runs claude with the key passed through but never prints the auth: line. The executor itself is fine (processor.New at pkg/processor/runner.go:143 always wires PreserveAPIKey from AppConfig), so this is purely a visibility gap. One-line fix plus a test mirroring main_test.go:984-1013.

This matters more now since with #1 reverted the banner becomes the only safety surface for the wrong-context risk.

Per maintainer feedback on PR umputun#335:

- Revert the paired --no-preserve-anthropic-api-key flag and its override
  branch in applyCLIOverrides. The startup banner is sufficient as a
  per-run safety surface; a second flag adds confusion without meaningful
  safety gain. Drops the two related subtests.

- Fix plan-mode banner gap: runPlanMode built startupInfo without
  PreserveAnthropicAPIKey, so `ralphex --plan "..."` with passthrough
  enabled ran claude with the key but never printed the auth line. Plumb
  the field into the plan-mode call site and emit the auth conditional
  inside the plan-mode branch of printStartupInfo, mirroring the full-mode
  shape. Adds a subtest covering the plan-mode visibility.

With the bug fixed, the banner is now universally visible whenever
passthrough is active, which is what the maintainer's argument relies on.

Refs umputun#335
@rsolmano
Copy link
Copy Markdown
Author

rsolmano commented May 7, 2026

Pushed 3b7f349.

  • Reverted --no-preserve-anthropic-api-key flag, override branch, and the two related subtests. Single --preserve-anthropic-api-key is back.
  • Fixed the plan-mode banner gap: runPlanMode now plumbs PreserveAnthropicAPIKey into startupInfo, and printStartupInfo's plan-mode branch emits the same auth: warn line as the full-mode branch. Subtest added.
  • Merge sentinel still on Values only; Config carries the resolved bool.
  • README/llms.txt/CLAUDE.md mentions of the removed flag stripped.

make fmt/make lint/make test clean, race detector clean on touched packages.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a flag to use API-key auth for claude code

3 participants