refactor: migrate all LLM calls to Claude Agent SDK#51
joshbouncesecurity wants to merge 13 commits into knostic:master
Adds libs/openant-core/utilities/model_config.py exposing MODEL_PRIMARY,
MODEL_AUXILIARY, and MODEL_DEFAULT, and replaces hardcoded
claude-opus-* / claude-sonnet-* string literals across the codebase
with imports from that module. Future model bumps become one-line changes.
Defaults match the canonical strings already in upstream
(claude-opus-4-20250514 and claude-sonnet-4-20250514). The
claude-opus-4-6 alias call sites are unified to MODEL_PRIMARY; both
forms route to the same Claude Opus 4 model on the Anthropic API
(the upstream MODEL_PRICING table already mapped both keys to the
same prices), so behavior is unchanged.
Adds tests:
- tests/test_model_config.py asserts the constants exist, are
non-empty strings, and match the expected claude-(opus|sonnet|haiku)-...
model-id format.
- A regression test scans every libs/openant-core/**/*.py file and
fails if any hardcoded claude-opus-*/claude-sonnet-* literal
reappears outside model_config.py itself.
Addresses item 15 from #16 (does not close).
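The two checks described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the regular expressions and the `find_hardcoded_models` helper are hypothetical, and the constant values are simply the two upstream ids quoted in the commit message.

```python
import re

# Values quoted in the commit message; the real module may define more.
MODEL_PRIMARY = "claude-opus-4-20250514"
MODEL_AUXILIARY = "claude-sonnet-4-20250514"

# Hypothetical patterns: one validates a model id, one spots stray literals.
MODEL_ID_RE = re.compile(r"^claude-(opus|sonnet|haiku)-[a-z0-9.-]+$")
HARDCODED_RE = re.compile(r"claude-(?:opus|sonnet)-[A-Za-z0-9.-]+")


def find_hardcoded_models(sources, allowed=("model_config.py",)):
    """Return (path, line_number) pairs for model literals outside allowed files.

    `sources` maps a file path to its text so the check can be exercised
    without touching the filesystem.
    """
    hits = []
    for path, text in sources.items():
        if path.endswith(allowed):
            continue  # the config module is the one place literals may live
        for lineno, line in enumerate(text.splitlines(), start=1):
            if HARDCODED_RE.search(line):
                hits.append((path, lineno))
    return hits
```

In a real test the `sources` mapping would presumably be built by reading every `*.py` file under the package directory.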
PR #25 claimed to remove `anthropic` from pyproject.toml but left four files still using it (`utilities/context_enhancer.py`, `utilities/finding_verifier.py`, `utilities/agentic_enhancer/agent.py`, `report/generator.py`). Every clean install fails at `from utilities ...` because `utilities/__init__.py` eagerly loads `context_enhancer`. The upstream merge also added a Zig parser that imports `tree_sitter_zig` without declaring it.
- Re-declare `anthropic>=0.40.0` and add `tree-sitter-zig>=0.20.0` to pyproject.toml so the declared deps match actual imports.
- Delete requirements.txt: it was a hand-maintained duplicate of pyproject.toml's deps, and the drift is exactly what let #25 slip through CI. Single source of truth now.
- Update CI to install via `pip install -e ".[dev]"` so pyproject.toml is exercised on every run.
- Add `tests/test_declared_dependencies.py`: a static check that every third-party import under the packaged dirs maps to a declared dependency, plus a smoke test that each packaged top-level module imports cleanly. Catches the regression class directly.
- Update README install instructions to match.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit c11cd50)
feat: add sdk_errors taxonomy for SDK-surfaced LLM errors (cherry picked from commit 1fe913e)
feat: surface SDK API errors as typed sdk_errors exceptions (cherry picked from commit 0ee9af8)
…tor-sdk refactor: port report/generator.py to Claude Agent SDK (cherry picked from commit 0b5675d)
…r-sdk refactor: port context_enhancer error classifier to sdk_errors (cherry picked from commit 27e5481)
…native refactor: port finding_verifier.py to Claude Agent SDK native tools (cherry picked from commit 09073d7)
…t-sdk refactor: port agentic_enhancer to Claude Agent SDK native tools (cherry picked from commit b73c618)
chore: drop anthropic dependency — SDK migration complete (cherry picked from commit 67376c9)
The fork's PR chain (#30-#38) ported the main LLM call sites but left upstream-only callers (application_context, generate_report, stage1_consistency) on the legacy anthropic SDK. Once #38 drops the anthropic dependency these import paths fail, so finish the migration here.
- llm_client.AnthropicClient: re-implement on top of the SDK helpers (_run_query_sync / _build_options) so the legacy class name keeps working without re-importing anthropic. Adds optional cost_usd to TokenTracker.record_call so SDK-reported total_cost_usd flows through.
- context/application_context.py: Anthropic() -> AnthropicClient.
- generate_report.py: anthropic.Anthropic -> AnthropicClient.
- utilities/stage1_consistency.py: client.messages.create -> client.analyze_sync (callers already pass an AnthropicClient).
- Tests: update test_silent_401 to mock _run_query_sync raising sdk_errors.AuthError, and test_disclosure_source_fidelity to patch generator.AnthropicClient instead of the removed anthropic.Anthropic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
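The optional `cost_usd` argument mentioned above might behave roughly like this. The class below is a simplified stand-in, not the real `TokenTracker`: its fields, the `PRICING` table, and the fallback calculation are all assumptions made for illustration, and the prices are placeholder numbers.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical per-million-token prices (input, output); placeholder values only.
PRICING = {"example-model": (15.0, 75.0)}


@dataclass
class TokenTracker:
    total_input_tokens: int = 0
    total_output_tokens: int = 0
    total_cost_usd: float = 0.0

    def record_call(self, model, input_tokens, output_tokens,
                    cost_usd: Optional[float] = None) -> float:
        """Record one call, preferring an SDK-reported cost when one is given."""
        if cost_usd is None:
            # Fall back to a local estimate from the pricing table.
            in_price, out_price = PRICING.get(model, (0.0, 0.0))
            cost_usd = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        self.total_input_tokens += input_tokens
        self.total_output_tokens += output_tokens
        self.total_cost_usd += cost_usd
        return cost_usd
```

Keeping the argument optional means existing callers that pass only token counts continue to work unchanged.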
- prompts/verification_prompts.py: add VERIFICATION_JSON_SCHEMA and get_native_claude_verification_prompt, required by finding_verifier's run_native_verification call. Upstream had only the legacy custom-tool prompt; the SDK port needs the structured-output prompt + schema.
- tests/test_local_claude.py: replace hardcoded claude-opus/sonnet literals with MODEL_PRIMARY/MODEL_AUXILIARY imports so the test_no_hardcoded_model_strings_outside_model_config regression test (from item 15) passes.
- utilities/llm_client.py: add TokenTracker.restore_from for checkpoint resume, and replace hardcoded model literal in docstring.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🤖 Automated Claude Code Review

Context: This is a follow-up (Round 2) review covering commit

Findings

F1: In

    elif api_key:
        print("Using Claude Agent SDK (API key mode)", file=sys.stderr)
    else:
        print("Using Claude Agent SDK (local session)", file=sys.stderr)

When neither

F2: The docstring in

F3: The updated

    rate_limiter.reset_rate_limiter()                # resets internal state but keeps _rate_limiter alive
    rate_limiter._rate_limiter = None                # nulls the module-level handle
    rate_limiter.GlobalRateLimiter._instance = None  # nulls the class-level Borg singleton

The

F4: In

    try:
        from claude_agent_sdk import (
            CLIConnectionError, CLIJSONDecodeError, CLINotFoundError, ProcessError,
        )
        if isinstance(exc, CLIConnectionError): ...
    except ImportError:
        pass

This is called on every exception (potentially hundreds per run under parallel workers).

F5:

F6: In

    get_rate_limiter().wait_if_needed()
    try:
        from claude_agent_sdk import ClaudeSDKError
    except ImportError:
        ClaudeSDKError = ()

The

F7: In

    if agree is None:
        agree = correct_finding == original_finding

If the free-text response doesn't contain "agree" or "disagree" but does contain e.g. "VULNERABLE", and the original finding was also "vulnerable",

F8: In

    json_block = re.search(r'```(?:json)?\s*(\{.*?\})\s*```', text, re.DOTALL)

    {
      "agree": true,
      "correct_finding": "safe",
      "explanation": "nested {} braces here"
    }

The non-greedy

F9: In

    repo_path = str(self.index.repo_path) if self.index.repo_path else None
    options = _build_options(
        ...
        add_dirs=[repo_path] if repo_path else [],
        cwd=repo_path,
        ...
    )

When

F10: In

    self.tracker.record_call(
        model=VERIFIER_MODEL,
        input_tokens=input_tokens,
        output_tokens=output_tokens,
        cost_usd=result.get("cost_usd"),
    )

But

F11:

F12: The quick-start in

F13: In the remediation-guidance block of

Positive Notes
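For context on F8, regex-based extraction of a JSON object from free text can be fragile when the payload itself contains braces. One alternative, sketched here with the standard library's `json.JSONDecoder.raw_decode`, is to let the JSON parser decide where the object ends. The function name is hypothetical and this is not the code under review.

```python
import json
from typing import Optional


def extract_first_json_object(text: str) -> Optional[dict]:
    """Return the first parseable JSON object embedded in `text`, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever trailing text follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)  # try the next candidate brace
    return None
```

Because the parser tracks string boundaries, braces inside string values such as `"nested {} braces here"` do not end the match early.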
…fication
Round 2 of iterative review on PR knostic#51:
- llm_client._log_auth_mode: when neither ANTHROPIC_API_KEY nor OPENANT_LOCAL_CLAUDE=true is set, print a Warning explaining the SDK will rely on the local claude CLI's own auth state instead of falsely claiming "local session" mode.
- context_enhancer._build_error_info: drop the CLINotFoundError -> auth mapping. Missing claude binary is an environmental issue, not an API auth failure - leaving type as unknown keeps the diagnostic accurate while exception_class still pinpoints the cause.
- tests/test_declared_dependencies DIST_TO_IMPORT: add tree-sitter-zig entry for consistency with sibling tree-sitter-* entries (the fallback hyphen-to-underscore conversion already produced the correct name, but explicit listing matches convention).
Tests: 149 passed, 22 skipped.
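The three-way auth-mode decision described in this commit can be sketched as a pure function. This is only an illustration: `describe_auth_mode` is a hypothetical name, the precedence of `OPENANT_LOCAL_CLAUDE` over `ANTHROPIC_API_KEY` is an assumption, and the warning wording is invented.

```python
from typing import Mapping


def describe_auth_mode(env: Mapping[str, str]) -> str:
    """Return the status line for the auth mode implied by `env`."""
    # Assumed precedence: an explicit local-session opt-in wins over an API key.
    if env.get("OPENANT_LOCAL_CLAUDE", "").lower() == "true":
        return "Using Claude Agent SDK (local session)"
    if env.get("ANTHROPIC_API_KEY"):
        return "Using Claude Agent SDK (API key mode)"
    # Neither is set: say so rather than claiming a local session.
    return (
        "Warning: neither ANTHROPIC_API_KEY nor OPENANT_LOCAL_CLAUDE=true is set; "
        "the SDK will rely on the local claude CLI's own auth state"
    )
```

A caller would typically pass `os.environ` and print the result to `sys.stderr`, matching the existing messages.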
Manual verification
Largest PR: full LLM call surface migrated. Requires either

Local test results
Reinstalled openant-core from this branch on Windows and ran the full pipeline end-to-end against
Commands run:
Outcome (against the manual-verification checklist):
One side observation: when the SDK initialised it noticed a corrupted
Cost: $0.18 reported (Sonnet, 2 units, 49s).
Summary
Routes all LLM calls (analyze, enhance, verify, report, context, remediation) through the Claude Agent SDK (`claude-agent-sdk`) instead of the `anthropic` API.

Benefits:
- `Read`/`Grep`/`Glob`/`Bash` are SDK-native; the manual multi-turn tool dispatch loop in `finding_verifier.py` is replaced by `run_native_verification`.
- `ResultMessage.total_cost_usd`.
- `AssistantMessage.error` (Literal: `rate_limit`/`authentication_failed`/`billing_error`/`invalid_request`/`server_error`/`unknown`) is mapped to a typed `utilities/sdk_errors.py` taxonomy and centrally reported through the existing `GlobalRateLimiter`.

Dependency change: drops `anthropic>=0.40.0`, adds `claude-agent-sdk>=0.1.48`. The new `tests/test_declared_dependencies.py` regression guard fails CI if any imported distribution isn't declared in `pyproject.toml`, preventing silent re-introduction on future merges.

Addresses item 16 from #16 (does not close the issue).
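As a rough illustration of the error mapping described in the summary, the sketch below turns the listed error literals into typed exceptions. Only `AuthError` is named elsewhere in this PR; the other class names, the `SDKError` base class, and the `raise_for_error` helper are hypothetical and may not match `utilities/sdk_errors.py`.

```python
from typing import Optional


class SDKError(Exception):
    """Hypothetical base class for SDK-surfaced LLM errors."""


class RateLimitError(SDKError): ...
class AuthError(SDKError): ...
class BillingError(SDKError): ...
class InvalidRequestError(SDKError): ...
class ServerError(SDKError): ...
class UnknownSDKError(SDKError): ...


# Keys are the literal values listed in the PR description.
ERROR_MAP = {
    "rate_limit": RateLimitError,
    "authentication_failed": AuthError,
    "billing_error": BillingError,
    "invalid_request": InvalidRequestError,
    "server_error": ServerError,
    "unknown": UnknownSDKError,
}


def raise_for_error(error: Optional[str], detail: str = "") -> None:
    """Raise the typed exception for an error literal; do nothing if None."""
    if error is None:
        return
    exc_cls = ERROR_MAP.get(error, UnknownSDKError)
    raise exc_cls(detail or error)
```

Callers can then catch `RateLimitError` separately (for example to notify a rate limiter) while letting `AuthError` abort the run.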
Commits in this PR
The migration is structured as 10 reviewable commits cherry-picked from the fork's incremental ports. Each compiles in isolation; `anthropic` stays declared in `pyproject.toml` until commit 8 (PR #38) so intermediate states still build.
1. `test_declared_dependencies.py`).
2. `utilities/sdk_errors.py` taxonomy.
3. `AssistantMessage.error` surfacing into `llm_client._run_query`, central rate-limit reporting through `GlobalRateLimiter`. (Also introduces the SDK helper functions: `_build_env`, `_build_options`, `_run_query`, `_run_query_sync`, `run_native_verification`.)
4. `report/generator.py` (single-turn, smallest site).
5. `context_enhancer._build_error_info` isinstance chain to `sdk_errors.*`.
6. `finding_verifier.py` (manual tool loop -> `run_native_verification` with SDK native tools).
7. `agentic_enhancer/agent.py`; absorb `shared_client` removal in `context_enhancer.py`.
8. `anthropic>=0.40.0` from `pyproject.toml`; clean up `openant/cli.py` remediation-guidance call.
9. `application_context.py`, `generate_report.py`, `stage1_consistency.py`. Re-implement `AnthropicClient` on top of the SDK helpers so the legacy class name keeps working.
10. `VERIFICATION_JSON_SCHEMA`, `get_native_claude_verification_prompt`), `TokenTracker.restore_from`, and centralize remaining model literals.

Test plan
- `tests/test_declared_dependencies.py` passes locally (8 tests).
- `git grep "^import anthropic\|^from anthropic" -- '*.py' ':!tests/'` is empty.
- `tests/test_go_cli.py` excluded, needs Go binary).
- `openant analyze` against a small fixture with `ANTHROPIC_API_KEY` exported: runs to completion, cost reported, no `anthropic` traceback.
- `openant analyze` with no API key but logged into Claude Code locally: uses the local session.