refactor: migrate all LLM calls to Claude Agent SDK #51

Draft
joshbouncesecurity wants to merge 13 commits into knostic:master from joshbouncesecurity:refactor/issue16-16-sdk-migration

Conversation

@joshbouncesecurity
Contributor

Summary

Routes all LLM calls (analyze, enhance, verify, report, context, remediation) through the Claude Agent SDK (claude-agent-sdk) instead of calling the anthropic client library directly.

Benefits:

  • Single backend: every LLM call goes through one SDK.
  • Local Claude Code support: the SDK natively handles both API-key auth and local Claude Code session auth.
  • Native tools: Read / Grep / Glob / Bash are SDK-native; the manual multi-turn tool dispatch loop in finding_verifier.py is replaced by run_native_verification.
  • Accurate cost tracking: uses ResultMessage.total_cost_usd.
  • SDK-surfaced errors: AssistantMessage.error (Literal: rate_limit/authentication_failed/billing_error/invalid_request/server_error/unknown) is mapped to a typed utilities/sdk_errors.py taxonomy and centrally reported through the existing GlobalRateLimiter.
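The mapping from the AssistantMessage.error literal to typed exceptions could look roughly like the sketch below. It is not the contents of utilities/sdk_errors.py: only the six literal values come from the description above, while the class names and the exception_for helper are assumptions made for illustration.

```python
class SDKError(Exception):
    """Base class for errors surfaced through AssistantMessage.error (hypothetical)."""


class RateLimitError(SDKError): pass
class AuthError(SDKError): pass
class BillingError(SDKError): pass
class InvalidRequestError(SDKError): pass
class ServerError(SDKError): pass
class UnknownSDKError(SDKError): pass


# One entry per literal value listed in the PR description.
ERROR_LITERAL_TO_EXCEPTION = {
    "rate_limit": RateLimitError,
    "authentication_failed": AuthError,
    "billing_error": BillingError,
    "invalid_request": InvalidRequestError,
    "server_error": ServerError,
    "unknown": UnknownSDKError,
}


def exception_for(error_literal, detail=""):
    """Build a typed exception for an SDK error literal.

    Values the table does not know about fall back to UnknownSDKError so a
    newer SDK release cannot raise KeyError here.
    """
    exc_type = ERROR_LITERAL_TO_EXCEPTION.get(error_literal, UnknownSDKError)
    message = f"{error_literal}: {detail}" if detail else str(error_literal)
    return exc_type(message)
```

A caller can then branch on the exception type (for example, reporting only RateLimitError to the rate limiter) instead of comparing strings.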

Dependency change: drops anthropic>=0.40.0, adds claude-agent-sdk>=0.1.48. The new tests/test_declared_dependencies.py regression-guard fails CI if any imported distribution isn't declared in pyproject.toml, preventing silent re-introduction on future merges.

Depends on #42 (item 15 — centralize model IDs). Please merge that first.

Addresses item 16 from #16 (does not close the issue).

Commits in this PR

The migration is structured as 10 reviewable commits cherry-picked from the fork's incremental ports. Each compiles in isolation; anthropic stays declared in pyproject.toml until commit 8 (PR #38) so intermediate states still build.

  1. Regression guard for declared dependencies (test_declared_dependencies.py).
  2. Add utilities/sdk_errors.py taxonomy.
  3. Wire AssistantMessage.error surfacing into llm_client._run_query, central rate-limit reporting through GlobalRateLimiter. (Also introduces the SDK helper functions: _build_env, _build_options, _run_query, _run_query_sync, run_native_verification.)
  4. Port report/generator.py (single-turn, smallest site).
  5. Port context_enhancer._build_error_info isinstance chain to sdk_errors.*.
  6. Port finding_verifier.py (manual tool loop -> run_native_verification with SDK native tools).
  7. Port agentic_enhancer/agent.py; absorb shared_client removal in context_enhancer.py.
  8. Drop anthropic>=0.40.0 from pyproject.toml; clean up openant/cli.py remediation-guidance call.
  9. Port the upstream-only call sites the fork chain didn't touch: application_context.py, generate_report.py, stage1_consistency.py. Re-implement AnthropicClient on top of the SDK helpers so the legacy class name keeps working.
  10. Add SDK verification prompts (VERIFICATION_JSON_SCHEMA, get_native_claude_verification_prompt), TokenTracker.restore_from, and centralize remaining model literals.

Test plan

  • tests/test_declared_dependencies.py passes locally (8 tests).
  • git grep "^import anthropic\|^from anthropic" -- '*.py' ':!tests/' is empty.
  • Existing pytest suite passes locally: 149 passed, 10 skipped (tests/test_go_cli.py excluded — needs Go binary).
  • CI: green on Linux/macOS/Windows.
  • Manual: openant analyze against a small fixture with ANTHROPIC_API_KEY exported — runs to completion, cost reported, no anthropic traceback.
  • Manual: openant analyze with no API key but logged into Claude Code locally — uses the local session.

joshbouncesecurity and others added 11 commits May 4, 2026 21:15
Adds libs/openant-core/utilities/model_config.py exposing MODEL_PRIMARY,
MODEL_AUXILIARY, and MODEL_DEFAULT, and replaces hardcoded
claude-opus-* / claude-sonnet-* string literals across the codebase
with imports from that module. Future model bumps become one-line.

Defaults match the canonical strings already in upstream
(claude-opus-4-20250514 and claude-sonnet-4-20250514). The
claude-opus-4-6 alias call sites are unified to MODEL_PRIMARY; both
forms route to the same Claude Opus 4 model on the Anthropic API
(the upstream MODEL_PRICING table already mapped both keys to the
same prices), so behavior is unchanged.

Adds tests:

  - tests/test_model_config.py asserts the constants exist, are
    non-empty strings, and match the expected claude-(opus|sonnet|haiku)-...
    model-id format.
  - A regression test scans every libs/openant-core/**/*.py file and
    fails if any hardcoded claude-opus-*/claude-sonnet-* literal
    reappears outside model_config.py itself.

Addresses item 15 from #16 (does not close).
)

PR #25 claimed to remove `anthropic` from pyproject.toml but left four
files still using it (`utilities/context_enhancer.py`,
`utilities/finding_verifier.py`, `utilities/agentic_enhancer/agent.py`,
`report/generator.py`). Every clean install fails at
`from utilities ...` because `utilities/__init__.py` eagerly loads
`context_enhancer`. The upstream merge also added a Zig parser that
imports `tree_sitter_zig` without declaring it.

- Re-declare `anthropic>=0.40.0` and add `tree-sitter-zig>=0.20.0` to
  pyproject.toml so the declared deps match actual imports.
- Delete requirements.txt — it was a hand-maintained duplicate of
  pyproject.toml's deps, and the drift is exactly what let #25 slip
  through CI. Single source of truth now.
- Update CI to install via `pip install -e ".[dev]"` so pyproject.toml
  is exercised on every run.
- Add `tests/test_declared_dependencies.py`: a static check that every
  third-party import under the packaged dirs maps to a declared
  dependency, plus a smoke test that each packaged top-level module
  imports cleanly. Catches the regression class directly.
- Update README install instructions to match.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit c11cd50)

feat: add sdk_errors taxonomy for SDK-surfaced LLM errors
(cherry picked from commit 1fe913e)

feat: surface SDK API errors as typed sdk_errors exceptions
(cherry picked from commit 0ee9af8)
…tor-sdk

refactor: port report/generator.py to Claude Agent SDK
(cherry picked from commit 0b5675d)
…r-sdk

refactor: port context_enhancer error classifier to sdk_errors
(cherry picked from commit 27e5481)
…native

refactor: port finding_verifier.py to Claude Agent SDK native tools
(cherry picked from commit 09073d7)
…t-sdk

refactor: port agentic_enhancer to Claude Agent SDK native tools
(cherry picked from commit b73c618)
chore: drop anthropic dependency — SDK migration complete
(cherry picked from commit 67376c9)

The fork's PR chain (#30-#38) ported the main LLM call sites but left
upstream-only callers (application_context, generate_report, stage1_consistency)
on the legacy anthropic SDK. Once #38 drops the anthropic dependency these
import paths fail, so finish the migration here.

- llm_client.AnthropicClient: re-implement on top of the SDK helpers
  (_run_query_sync / _build_options) so the legacy class name keeps
  working without re-importing anthropic. Adds optional cost_usd to
  TokenTracker.record_call so SDK-reported total_cost_usd flows through.
- context/application_context.py: Anthropic() -> AnthropicClient.
- generate_report.py: anthropic.Anthropic -> AnthropicClient.
- utilities/stage1_consistency.py: client.messages.create ->
  client.analyze_sync (callers already pass an AnthropicClient).
- Tests: update test_silent_401 to mock _run_query_sync raising
  sdk_errors.AuthError, and test_disclosure_source_fidelity to patch
  generator.AnthropicClient instead of the removed anthropic.Anthropic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- prompts/verification_prompts.py: add VERIFICATION_JSON_SCHEMA and
  get_native_claude_verification_prompt — required by finding_verifier's
  run_native_verification call. Upstream had only the legacy custom-tool
  prompt; the SDK port needs the structured-output prompt + schema.
- tests/test_local_claude.py: replace hardcoded claude-opus/sonnet
  literals with MODEL_PRIMARY/MODEL_AUXILIARY imports so the
  test_no_hardcoded_model_strings_outside_model_config regression test
  (from item 15) passes.
- utilities/llm_client.py: add TokenTracker.restore_from for checkpoint
  resume, and replace hardcoded model literal in docstring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joshbouncesecurity
Contributor Author

joshbouncesecurity commented May 4, 2026

Superseded by newer review

🤖 Automated Claude Code Review

@joshbouncesecurity
Contributor Author

🤖 Automated Claude Code Review

Context: This is a follow-up (Round 2) review covering commit b3c1d9f ("fix(sdk-migration): tighten error handling and auth-mode plumbing"). The previous recorded review (commit 521ca0b) had an empty body, so findings are numbered starting at F1. Where a concern was introduced before 521ca0b and left unaddressed, it is noted as a pre-existing finding confirmed by this review. Where b3c1d9f introduced or fixed something, that is called out explicitly.


Findings

F1 — _log_auth_mode prints wrong message when no API key and not in local mode (pre-existing bug, not fixed by the b3c1d9f restructuring)

In utilities/llm_client.py the elif api_key / else branch now reads:

elif api_key:
    print("Using Claude Agent SDK (API key mode)", file=sys.stderr)
else:
    print("Using Claude Agent SDK (local session)", file=sys.stderr)

When neither OPENANT_LOCAL_CLAUDE=true nor ANTHROPIC_API_KEY is set, the code falls into the else branch and tells the user "local session" — but the subsequent SDK call will fail with an auth error because there is no key and no local session configured. The message is actively misleading; it should say something like "no auth configured — set ANTHROPIC_API_KEY or OPENANT_LOCAL_CLAUDE=true". The original pre-fix code had the same logic bug; b3c1d9f restructured without fixing it.
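A three-way branch along the lines suggested here might look like the sketch below. The function name describe_auth_mode is hypothetical, the wording of the warning is only one possibility, and the message for the local-mode branch is an assumption, since the quoted snippet starts at the elif.

```python
import os


def describe_auth_mode(env=None):
    """Return the banner text for the current auth configuration.

    `env` defaults to os.environ; passing a dict makes the function testable.
    """
    env = os.environ if env is None else env
    local = env.get("OPENANT_LOCAL_CLAUDE", "").strip().lower() == "true"
    api_key = env.get("ANTHROPIC_API_KEY", "").strip()

    if local:
        # Assumed message for the explicit local-mode branch.
        return "Using Claude Agent SDK (local session)"
    if api_key:
        return "Using Claude Agent SDK (API key mode)"
    # Neither is set: say so instead of claiming a local session.
    return (
        "Warning: no auth configured; set ANTHROPIC_API_KEY "
        "or OPENANT_LOCAL_CLAUDE=true"
    )
```

Returning the string rather than printing it keeps the branch logic separate from the once-only, lock-guarded printing that _log_auth_mode already performs.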

F2 — max_tokens silently dropped in AnthropicClient._call with no warning

The docstring in _call acknowledges that max_tokens is a no-op because ClaudeAgentOptions doesn't expose it. However, callers throughout the codebase pass explicit values (max_tokens=4096, max_tokens=2000, max_tokens=2048) that are now silently ignored. For report generation and context generation these values were safety ceilings to cap spending; without them the SDK lets the model generate uncapped output. At minimum, a one-time warnings.warn at the call site would surface the regression in tests/logs. More actionably, if ClaudeAgentOptions exposes max_output_tokens or similar in any version, this should be wired up. The comment says "See PR #51 for context" which is circular self-reference.
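The one-time warning could be as small as the following sketch. The helper name and the module-level flag are hypothetical, and the flag is not lock-protected, so under parallel workers it might occasionally fire more than once.

```python
import warnings

_max_tokens_warned = False  # module-level flag: warn at most once per process


def warn_max_tokens_ignored(max_tokens):
    """Emit a single RuntimeWarning the first time a caller passes max_tokens."""
    global _max_tokens_warned
    if max_tokens is None or _max_tokens_warned:
        return
    _max_tokens_warned = True
    warnings.warn(
        f"max_tokens={max_tokens} is ignored: it is not forwarded to the SDK, "
        "so output length is not capped",
        RuntimeWarning,
        stacklevel=2,  # point the warning at the caller, not at this helper
    )
```

Calling this at the top of _call would surface the dropped argument in test output and logs without changing behaviour.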

F3 — test_sdk_error_surfacing.py reset fixture leaves GlobalRateLimiter._instance in inconsistent state

The updated reset_rate_limiter fixture in test_sdk_error_surfacing.py (changed in b3c1d9f) does:

rate_limiter.reset_rate_limiter()    # resets internal state but keeps _rate_limiter alive
rate_limiter._rate_limiter = None    # nulls the module-level handle
rate_limiter.GlobalRateLimiter._instance = None  # nulls the class-level Borg singleton

The reset_rate_limiter() function only calls .reset() on the existing instance; it does not set _rate_limiter = None. The setup order is therefore: reset (instance stays alive), null the module handle, null the class singleton. With both _rate_limiter and GlobalRateLimiter._instance set to None, the next get_rate_limiter() call creates a fresh instance, and that part is correct.

The teardown path is the weak spot. The fixture calls reset_rate_limiter() again after nulling _rate_limiter: reset_rate_limiter() acquires _config_lock and checks that _rate_limiter is not None, but _rate_limiter was already set to None by the previous line of the yield cleanup, so the reset() call is a no-op. That is harmless, but the double-null pattern is fragile and the comment explaining it doesn't match the actual execution order. If a test creates a new limiter and mocks report_rate_limit on it, the teardown won't reset that instance's backoff state. The fix is to null _rate_limiter and _instance first, let get_rate_limiter() create a fresh instance, and skip the public reset_rate_limiter() call entirely.
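The suggested ordering can be illustrated with a self-contained stand-in. Everything below is a simplified model, not the real rate_limiter module: the class body, the backoff_until attribute, and the fresh_rate_limiter helper are assumptions, and a context manager is used in place of a pytest fixture so the sketch runs without pytest.

```python
from contextlib import contextmanager


class GlobalRateLimiter:
    """Stand-in singleton; the real class lives in rate_limiter.py."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.backoff_until = 0.0
        return cls._instance


_rate_limiter = None  # module-level handle, mirroring rate_limiter._rate_limiter


def get_rate_limiter():
    global _rate_limiter
    if _rate_limiter is None:
        _rate_limiter = GlobalRateLimiter()
    return _rate_limiter


def _drop_singletons():
    """Null both handles so the next get_rate_limiter() builds a new instance."""
    global _rate_limiter
    _rate_limiter = None
    GlobalRateLimiter._instance = None


@contextmanager
def fresh_rate_limiter():
    # Same steps on setup and teardown; no call to a public reset function.
    _drop_singletons()
    try:
        yield
    finally:
        _drop_singletons()
```

In a pytest suite the body of fresh_rate_limiter would sit in an autouse fixture. Because no instance survives teardown, backoff state set or mocked on one test's limiter cannot leak into the next test.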

F4 — _build_error_info process-level error classification uses deferred import inside a hot path with no caching

In utilities/context_enhancer.py (and now also finding_verifier.py which has an identical copy), the else branch of _build_error_info does:

try:
    from claude_agent_sdk import (
        CLIConnectionError, CLIJSONDecodeError, CLINotFoundError, ProcessError,
    )
    if isinstance(exc, CLIConnectionError): ...
except ImportError:
    pass

This is called on every exception (potentially hundreds per run under parallel workers). A function-level from X import Y on such a path is cheap in CPython because the import system caches the module after the first load, but the wrapping try/except ImportError and the repeated module attribute lookups still add unnecessary overhead. More importantly, having two identical copies of this logic (context_enhancer and finding_verifier) means that an SDK version bump that renames a class (say CLIConnectionError → CliConnectionError) will need to be fixed in two places. Extract to a shared helper in sdk_errors.py (e.g., classify_sdk_process_error(exc) -> dict | None).
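A shared helper of the kind proposed might be sketched as follows. The import happens once at module load, with local stand-in classes as a fallback so the sketch runs without the SDK installed. The type labels and the shape of the returned dict are assumptions, not the project's actual schema.

```python
try:
    from claude_agent_sdk import (
        CLIConnectionError, CLIJSONDecodeError, CLINotFoundError, ProcessError,
    )
except ImportError:
    # Stand-ins used only when the SDK is absent, so non-LLM paths still import.
    class CLIConnectionError(Exception): pass
    class CLINotFoundError(CLIConnectionError): pass
    class ProcessError(Exception): pass
    class CLIJSONDecodeError(Exception): pass

# Ordered most specific first: CLINotFoundError is checked before
# CLIConnectionError in case it is a subclass of it.
_PROCESS_ERROR_TYPES = (
    (CLINotFoundError, "cli_not_found"),
    (CLIConnectionError, "connection"),
    (CLIJSONDecodeError, "protocol"),
    (ProcessError, "process"),
)


def classify_sdk_process_error(exc):
    """Return an info dict for process-level SDK errors, or None for anything else."""
    for exc_type, label in _PROCESS_ERROR_TYPES:
        if isinstance(exc, exc_type):
            return {
                "type": label,
                "exception_class": type(exc).__name__,
                "message": str(exc),
            }
    return None
```

Both context_enhancer and finding_verifier could then call this one function, so a renamed SDK class would need updating in a single place.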

F5 — CLINotFoundError classified as auth is semantically wrong and will confuse retry logic

_build_error_info maps CLINotFoundError to info["type"] = "auth". The comment says "caller-fixable config issue", which is true, but is_retryable_error() in rate_limiter.py presumably does not retry auth-type errors. That's correct for a missing binary — but _build_error_info is also used to classify errors for the context enhancer's retry policy. If the claude binary disappears mid-run (e.g., during an OS update), all in-flight work will be abandoned with an opaque "auth error" rather than a message pointing at the real problem. A dedicated type like "cli_not_found" or at least a distinct message would help operators. At minimum, the log/error message reaching the user should say "claude binary not found" rather than the auth-error path it currently follows.

F6 — verify_result calls get_rate_limiter().wait_if_needed() before the lazy ClaudeSDKError import, but the pre-existing lock may block for minutes

In finding_verifier.py:

get_rate_limiter().wait_if_needed()

try:
    from claude_agent_sdk import ClaudeSDKError
except ImportError:
    ClaudeSDKError = ()

The wait_if_needed() call can block (sleeping) for up to backoff_seconds (default 30s, configurable up to much longer). The import of ClaudeSDKError happens after the wait. This is fine at runtime but is a subtle ordering issue if someone moves the import block upward without understanding the design intent. More importantly: _run_query_sync (called in the try below) also calls wait_if_needed() internally via _call and via the direct call added to verify_result. This means two wait_if_needed() calls happen back-to-back for each verification: one in verify_result and one inside run_native_verification → _run_query_sync. The outer one is redundant and should be removed from verify_result (the inner call in the SDK layer is sufficient).

F7 — _parse_freetext_verdict can return wrong agree value when verdict matches original finding coincidentally

In finding_verifier._parse_freetext_verdict:

if agree is None:
    agree = correct_finding == original_finding

If the free-text response contains neither "agree" nor "disagree", agree is inferred by comparing the parsed verdict with the original finding. Walking through the cases: when the model writes "I find this is VULNERABLE" as a Stage-2 upgrade from "safe", agree is inferred as False (correct_finding="vulnerable" != original_finding="safe"); when Stage-2 echoes "SAFE" to confirm Stage-1's "safe" finding, agree is inferred as True; when the model says "PROTECTED" and the original finding was "bypassable", agree becomes False. All three outcomes are correct, so the heuristic is reasonable, but it is undocumented and untested for the path where the agree keyword is absent. Add a test covering the "no agree/disagree keyword" path.
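A test for the keyword-absent path could follow the shape below. The parser is a hypothetical stand-in written only so the test has something to run against; the real _parse_freetext_verdict may extract the verdict differently, and only the final agree inference is taken from the snippet quoted above.

```python
import re


def parse_freetext_verdict(text, original_finding):
    """Stand-in parser: find a verdict word, then decide `agree`."""
    lowered = text.lower()
    match = re.search(r"\b(vulnerable|safe|protected|bypassable)\b", lowered)
    correct_finding = match.group(1) if match else original_finding

    if re.search(r"\bdisagree\b", lowered):
        agree = False
    elif re.search(r"\bagree\b", lowered):
        agree = True
    else:
        agree = None

    # The path under test: no keyword, so infer agreement from the verdicts.
    if agree is None:
        agree = correct_finding == original_finding
    return {"agree": agree, "correct_finding": correct_finding}


def test_agree_inferred_when_keyword_absent():
    # Upgrade from "safe" to "vulnerable" counts as disagreement.
    assert parse_freetext_verdict("I find this is VULNERABLE", "safe")["agree"] is False
    # Echoing the original verdict counts as agreement.
    assert parse_freetext_verdict("This is SAFE.", "safe")["agree"] is True
    # A different verdict word counts as disagreement.
    assert parse_freetext_verdict("PROTECTED by validation", "bypassable")["agree"] is False
```

An explicit keyword still wins over the inference, so a reply such as "I disagree, this is SAFE" yields agree=False even when the verdict matches.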

F8 — _extract_json regex for code block extraction won't handle multi-line JSON correctly

In FindingVerifier._extract_json:

json_block = re.search(r'```(?:json)?\s*(\{.*?\})\s*```', text, re.DOTALL)

\{.*?\} with re.DOTALL uses non-greedy matching. For a JSON object like:

{
  "agree": true,
  "correct_finding": "safe",
  "explanation": "nested {} braces here"
}

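Taken on its own, a non-greedy \{.*?\} stops at the first } it finds, which here would be the } of the {} embedded in the explanation string. In this pattern, though, the lazy group is followed by \s* and the closing fence, so the engine keeps extending the match until it reaches a } that is followed only by whitespace and the fence. A well-formed fenced block like the one above is therefore captured whole. The pattern is still fragile: it truncates the object if a string value itself contains } followed by a code fence, and its correctness rests on a subtle interaction between the lazy quantifier and the trailing anchor. The find('{')/rfind('}') pattern used in the fallback branch is simpler to reason about and would be the safer choice here as well.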
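The find/rfind approach mentioned above can be sketched in a few lines. The function name is hypothetical and this is not the project's _extract_json; it assumes the reply contains exactly one JSON object and no stray braces in the surrounding prose.

```python
import json


def extract_json_object(text):
    """Return the outermost {...} span of `text` parsed as JSON, or None.

    Slicing from the first "{" to the last "}" avoids any reliance on regex
    quantifier behaviour, so braces inside string values are harmless.
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
```

The trade-off is that prose containing braces before or after the object makes the slice invalid JSON, in which case the function returns None rather than a partial result.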

F9 — ContextAgent passes cwd=repo_path and add_dirs=[repo_path] but repo_path can be None when self.index.repo_path is None

In agentic_enhancer/agent.py:

repo_path = str(self.index.repo_path) if self.index.repo_path else None

options = _build_options(
    ...
    add_dirs=[repo_path] if repo_path else [],
    cwd=repo_path,
    ...
)

When repo_path is None, cwd=None is passed to ClaudeAgentOptions. If ClaudeAgentOptions does not accept None for cwd (or passes it directly to subprocess), this will fail with a confusing error. The add_dirs guard is correct; the same guard should be applied to cwd: cwd=repo_path if repo_path else None is fine only if ClaudeAgentOptions explicitly supports cwd=None as "use default". This should be verified against the SDK contract or the cwd kwarg should be omitted from the _build_options call when repo_path is None (use **({"cwd": repo_path} if repo_path else {})).
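The conditional-kwarg idiom mentioned at the end can be shown in isolation. Here build_options is a hypothetical stand-in that only records the keyword arguments it receives; it is not the project's _build_options, and whether ClaudeAgentOptions accepts cwd=None still needs checking against the SDK.

```python
def build_options(**kwargs):
    """Stand-in for _build_options: returns the kwargs it was given."""
    return dict(kwargs)


def options_for_repo(repo_path):
    """Build options, passing cwd only when a repository path is known."""
    repo = str(repo_path) if repo_path else None
    return build_options(
        add_dirs=[repo] if repo else [],
        # Omit cwd entirely when there is no path, so the callee's own
        # default applies instead of an explicit None.
        **({"cwd": repo} if repo else {}),
    )
```

With this shape the callee never sees cwd=None, which sidesteps the question of how the SDK treats that value.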

F10 — token/cost tracking in verify_result does not double-count, but the recording contract is undocumented

In finding_verifier.verify_result, the tokens are recorded via:

self.tracker.record_call(
    model=VERIFIER_MODEL,
    input_tokens=input_tokens,
    output_tokens=output_tokens,
    cost_usd=result.get("cost_usd"),
)

But run_native_verification in llm_client.py does not record to any tracker — it returns a raw dict. Good so far. However, _run_query_sync (which run_native_verification calls) does not record to the tracker either — it just returns (result_message, last_text). So there's no double-counting here. But for AnthropicClient._call, the tracker records are made inside _call. If someone calls _consistency_client.analyze_sync (which wraps _call), the tracker records once in _call and then _check_consistency does not record again. That's also correct. This is fine — but a clarifying comment noting that run_native_verification does NOT auto-record to the tracker (unlike AnthropicClient._call) would prevent future accidental double-recording.

F11 — test_declared_dependencies.py does not map tree-sitter-zig in DIST_TO_IMPORT

pyproject.toml adds tree-sitter-zig>=0.20.0 and the Zig parser imports tree_sitter_zig. The test's _dist_name_to_import normalizes tree-sitter-zig → tree_sitter_zig (hyphen → underscore), so the PyPI name resolves to the import name without a manual mapping entry, and the check works correctly. However, the other tree-sitter entries (tree-sitter-c, tree-sitter-cpp, etc.) are listed in DIST_TO_IMPORT explicitly, which makes the omission of tree-sitter-zig look like an oversight even though it is functionally correct. Add it to DIST_TO_IMPORT alongside the other tree-sitter entries for consistency, or add a comment explaining why tree-sitter-zig doesn't need a manual entry when the others do.
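The lookup-with-fallback behaviour described here can be sketched as below. The table contents are illustrative only and do not reproduce the real DIST_TO_IMPORT; the pyyaml entry is included as a well-known case where the distribution and import names genuinely differ.

```python
# Explicit overrides, keyed by lower-cased distribution name (illustrative).
DIST_TO_IMPORT = {
    "pyyaml": "yaml",                      # names differ, entry is required
    "tree-sitter-zig": "tree_sitter_zig",  # same as the fallback, listed for consistency
}


def dist_name_to_import(dist_name):
    """Map a distribution name to its import name.

    Explicit entries win; otherwise hyphens become underscores.
    """
    key = dist_name.strip().lower()
    return DIST_TO_IMPORT.get(key, key.replace("-", "_"))
```

The explicit tree-sitter-zig entry changes nothing functionally, which matches the finding: it is there for readers of the table rather than for the lookup.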

F12 — README.md still shows echo "ANTHROPIC_API_KEY=..." as the only auth method

The quick-start in libs/openant-core/README.md was updated to use pip install -e . but the "Set API key" section still only mentions ANTHROPIC_API_KEY. There is no mention of OPENANT_LOCAL_CLAUDE=true. Given that local Claude Code auth is now a first-class path (explicitly supported in _build_env, _check_api_key, and _log_auth_mode), the README should document it.

F13 — openant/cli.py remediation path creates AnthropicClient inline without a local-mode API-key check

In the remediation-guidance block of cmd_report_data, the old code explicitly checked client = anthropic.Anthropic() which would fail fast if the key was missing. The new code creates AnthropicClient(model=MODEL_AUXILIARY) which logs the auth mode but proceeds. For the API-key path this is fine — the SDK call will fail later with an AuthError. But _check_api_key() is only called in report/generator.py (via generate_summary_report / generate_disclosure), not in cmd_report_data. So running openant report without any auth configured will proceed past the guard and fail mid-generation rather than failing fast with a helpful message. The fix is to call _check_api_key() (or an equivalent guard) at the top of cmd_report_data, or at least before the LLM call that generates remediation HTML.


Positive Notes

  • The double-checked locking pattern in _log_auth_mode (if logged → lock → if logged → body) is correct and standard for this use case. Using a module-level flag + lock avoids re-printing under the parallel-worker load described in the docstring.
  • The exception hierarchy in _build_error_info correctly distinguishes between API-reported errors (surfaced as typed sdk_errors.* exceptions by _run_query) and process-level SDK errors (CLIConnectionError etc.), mapping each to the appropriate retry category.
  • The lazy import of ClaudeSDKError in finding_verifier is a reasonable approach that keeps the module importable on non-LLM paths.
  • The reset_rate_limiter docstring expansion in b3c1d9f accurately explains the two-level singleton design, which is genuinely subtle and worth documenting.
  • The _check_api_key() docstring was correctly updated to reflect the two-mode auth contract, including the OPENANT_LOCAL_CLAUDE path — a concrete improvement over the previous one-liner.
  • Replacing (RuntimeError, FileNotFoundError, TimeoutError) with (RuntimeError, FileNotFoundError, TimeoutError, ClaudeSDKError) in the verifier's except clause is a correct and meaningful widening of the catch.

…fication

Round 2 of iterative review on PR knostic#51:

- llm_client._log_auth_mode: when neither ANTHROPIC_API_KEY nor
  OPENANT_LOCAL_CLAUDE=true is set, print a Warning explaining the SDK
  will rely on the local claude CLI's own auth state instead of falsely
  claiming "local session" mode.
- context_enhancer._build_error_info: drop the CLINotFoundError -> auth
  mapping. Missing claude binary is an environmental issue, not an API
  auth failure - leaving type as unknown keeps the diagnostic accurate
  while exception_class still pinpoints the cause.
- tests/test_declared_dependencies DIST_TO_IMPORT: add tree-sitter-zig
  entry for consistency with sibling tree-sitter-* entries (the fallback
  hyphen-to-underscore conversion already produced the correct name, but
  explicit listing matches convention).

Tests: 149 passed, 22 skipped.
@joshbouncesecurity
Contributor Author

Manual verification

Largest PR — full LLM call surface migrated. Requires either ANTHROPIC_API_KEY OR a local Claude Code session.

  • API-key path: export ANTHROPIC_API_KEY=..., run openant analyze <small-fixture>: completes, cost reported, no anthropic-related traceback.
  • Local session path: unset ANTHROPIC_API_KEY, be logged into Claude Code locally, export OPENANT_LOCAL_CLAUDE=true, re-run openant analyze: uses the local CLI.
  • No-auth path: unset both — clear warning banner stating the SDK will rely on local CLI auth state (no silent crash).
  • openant verify: completes; native SDK Read/Grep/Glob/Bash tools fire (replacing the previous manual tool dispatch loop).
  • openant generate-context: works with both auth modes.
  • openant report: works with both auth modes.
  • Dependency state: pip show anthropic after pip install -e libs/openant-core/ returns nothing (anthropic dropped). pip show claude-agent-sdk shows >=0.1.48.
  • No anthropic imports left: git grep -E '^import anthropic|^from anthropic' libs/openant-core --include='*.py' -- ':!tests/' returns nothing.
  • Rate-limit handling: when you hit a rate limit, GlobalRateLimiter backs off as before. Surface error class is one of the utilities/sdk_errors.py types.
  • Regression guard: tests/test_declared_dependencies.py would fail CI if a future merge silently re-introduced an undeclared import.
  • Note: depends on PR #42 (refactor: centralize model IDs into model_config.py); please merge that first.

@joshbouncesecurity
Contributor Author

Local test results

Reinstalled openant-core from this branch on Windows and ran the full pipeline end-to-end against sample_python_repo (3 files, 5 reachable units).

Commands run:

go build -o openant.exe ./
pip install -e libs/openant-core/

# Dependency state
pip show anthropic              # see note below
pip show claude-agent-sdk       # Version: 0.1.63 (>= 0.1.48) ✓

# No anthropic imports
git grep -nE "^import anthropic|^from anthropic" -- 'libs/openant-core/**/*.py' \
  ':!libs/openant-core/tests/**'
# (no output, exit 1)

# pyproject.toml
grep -E "anthropic|claude.agent.sdk" libs/openant-core/pyproject.toml
# Only: claude-agent-sdk>=0.1.48

# End-to-end
openant parse <sample_python_repo> --output <out> --level reachable    # 5 units
openant analyze <out>/dataset.json --output <out> --limit 2 --model sonnet

Outcome (against the manual-verification checklist):

  • API-key path: analyze ran to completion. Banner printed Using Claude Agent SDK (API key mode). 2 units processed, 3 API calls, ~3,873 tokens, $0.18 reported. No anthropic-related traceback. ✅
  • claude-agent-sdk 0.1.63 installed (≥ 0.1.48 floor) ✅
  • No ^import anthropic or ^from anthropic left in non-test code ✅
  • pyproject.toml no longer declares anthropic; claude-agent-sdk>=0.1.48 is declared ✅
  • ⚠️ pip show anthropic after install: the test plan expected this to return nothing. On my shared venv it still shows anthropic 0.96.0, because pip install -e . does not uninstall dependencies that were dropped from pyproject.toml. The runtime no longer imports it (verified by the empty git grep), so this is purely a leftover from the previous install. CI running in a clean venv would not see it. Worth noting for users upgrading: running pip uninstall anthropic after pulling is advisable.
  • Local-session path (OPENANT_LOCAL_CLAUDE=true), no-auth banner, verify/report end-to-end, rate-limit class taxonomy, regression-guard test were not exercised in this manual pass — covered by tests/test_declared_dependencies.py and the existing pytest suite.

One side observation: when the SDK initialised it noticed a corrupted ~/.claude.json from this shell and rotated it to a backup. That's SDK behaviour, unrelated to this PR.

Cost: $0.18 reported (Sonnet, 2 units, 49s).
