feat: LLM review stage for enhanced reachability detection by joshbouncesecurity · Pull Request #50 · knostic/OpenAnt

joshbouncesecurity · 2026-05-04T18:59:58Z

Summary

Adds a new LLM review stage (off by default, enabled via --llm-reachability on openant scan) that uses a strong model (Opus by default) to surface additional reachability signals beyond what the structural analysis catches:

Likely entry points the structural pass missed (framework-specific handlers, plugin/CLI registrations, message handlers).
External content ingestion (HTTP request bodies, file/network reads, env/argv, stdin, untrusted IPC).
Cross-process or async data flows.

Signals are advisory and only promote a unit's reachability — they never demote one that the structural analysis already kept. This matches the issue's "complements, not replaces" intent.

Output:

llm_reachability.json in the scan dir with the full signal list.
llm_reachability_signals: [...] field on each unit in dataset.json.
High-confidence entry_point signals set is_entry_point: true on the target unit.
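The promote-only merge described above can be sketched roughly as follows. This is an illustrative sketch, not the PR's `apply_signals` implementation: the function name `apply_signals_sketch` and the signal keys `unit_id`, `kind`, and `confidence` are assumptions, while `is_entry_point` and `llm_reachability_signals` come from the description.

```python
from copy import deepcopy


def apply_signals_sketch(units, signals):
    """Merge advisory signals into units without ever demoting a unit.

    Assumed shapes (hypothetical): each unit is a dict with an "id" key,
    each signal is a dict with "unit_id", "kind", and "confidence" keys.
    """
    merged = {unit["id"]: deepcopy(unit) for unit in units}
    rejected = []
    for signal in signals:
        unit = merged.get(signal.get("unit_id"))
        if unit is None:
            # Signals that point at unknown unit ids are rejected, not applied.
            rejected.append(signal)
            continue
        # Every accepted signal is recorded on its unit.
        unit.setdefault("llm_reachability_signals", []).append(signal)
        # Only high-confidence entry_point signals promote; nothing sets False.
        if signal.get("kind") == "entry_point" and signal.get("confidence") == "high":
            unit["is_entry_point"] = True
    return list(merged.values()), rejected
```

Because the only write to `is_entry_point` sets it to `True`, a unit the structural pass already marked as an entry point cannot be downgraded by any signal.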

Cost & rate-limit safety: reuses the existing GlobalRateLimiter via AnthropicClient, opt-in only, and prompts are batched (default 25 units/call).

Addresses #17 (does not close — let the maintainer review the prompt + heuristics first).

Test plan

Unit tests for analyze_reachability with mocked LLM: fixed JSON, malformed JSON, exception in client, app_context threading, batch chunking.
Unit tests for apply_signals: promote-only semantics (high-confidence promotes, medium does not, never demotes), per-unit signal accumulation, unknown-id rejection.
CLI plumbing tests: --llm-reachability appears in openant scan --help; default does not pass the flag through to scan_repository; setting it threads llm_reachability=True.
Existing pytest suite passes: 112 passed, 23 skipped (env-dependent, e.g. Go binary).
Manual: enable on a small Express fixture; verify route handlers gain entry-point signals.
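The CLI plumbing checks in the test plan can be pictured with a standard-library `argparse` parser. Whether OpenAnt's Python CLI actually uses `argparse` is an assumption, and `build_scan_parser` and `scan_kwargs` are hypothetical helper names; only the flag name and the `llm_reachability=True` keyword come from the PR text.

```python
import argparse


def build_scan_parser():
    """Hypothetical parser for a scan command with the opt-in flag."""
    parser = argparse.ArgumentParser(prog="openant scan")
    parser.add_argument("repo")
    parser.add_argument(
        "--llm-reachability",
        action="store_true",  # absent on the command line means False
        help="Run the LLM reachability review stage (off by default).",
    )
    return parser


def scan_kwargs(argv):
    """Build keyword arguments for a scan call; the flag is threaded only when set."""
    args = build_scan_parser().parse_args(argv)
    kwargs = {"repo": args.repo}
    if args.llm_reachability:
        kwargs["llm_reachability"] = True
    return kwargs
```

With this shape, a default invocation never passes the keyword at all, which is the behavior the "default does not pass the flag through" test asserts.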

Adds an opt-in LLM review stage (off by default, enabled via the new `--llm-reachability` flag on `openant scan`) that uses a strong model (Opus by default) to surface additional reachability signals beyond what the structural pass catches:

- Likely entry points the structural analysis may miss (framework hooks, plugin/CLI registrations, message handlers).
- External-input sites (HTTP request bodies, file/network reads, env/argv, stdin, untrusted IPC).
- Cross-process / async data-flow indicators.

Signals are advisory and *promote-only*: high-confidence entry-point signals can set `is_entry_point=True` on a unit, but no signal ever demotes a unit that the structural analysis already kept. This matches the "complements, does not replace" intent in issue #17.

Output:

- `llm_reachability.json` written to the scan dir with the full signal list.
- Each unit gains an `llm_reachability_signals` array on the dataset.

Cost & rate-limit safety: opt-in only, prompts are batched, and the client integration goes through the existing `AnthropicClient` (which respects `GlobalRateLimiter`).

Refs #17.

The Python CLI defines --llm-reachability for the LLM reachability stage (issue #17), but the Go CLI proxy did not expose it. The test TestHelp::test_scan_help_advertises_llm_reachability inspects 'openant scan --help' (Go cobra output) and was failing on all 3 OS targets. Register --llm-reachability as a Bool flag on the Go scan command and pass it through to the Python invocation when set.

- scanner.py: forward-declare app_context_path before step 1.5 so the LLM reachability block doesn't hit a NameError when --llm-reachability is enabled (the block ran before the app-context step that defined it).
- llm_reachability._chunk: non-positive batch_size used to reference an unbound loop variable; now collapses to a single batch covering all items. Adds a regression test.
- Help text (Python CLI + Go CLI): note that --llm-reachability may incur additional LLM cost, per cost-safety review.
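The batching behavior described for `_chunk` could look something like the sketch below. It is not the PR's code: the name `chunk_sketch` is hypothetical, the default of 25 is taken from the PR summary, and returning no batches for an empty input is an assumption.

```python
def chunk_sketch(items, batch_size=25):
    """Split items into batches of at most batch_size elements.

    A non-positive batch_size collapses to one batch covering all items,
    mirroring the fix described above. Empty input yields no batches
    (an assumption, not confirmed by the PR).
    """
    items = list(items)
    if not items:
        return []
    if batch_size <= 0:
        return [items]
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```

For example, 60 units with the default size produce three batches of 25, 25, and 10 units, so three LLM calls.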

The LLM reachability stage threads app_context into its prompt to help the model reason about expected entry points (web_app vs cli_tool, etc). The previous ordering ran it before app-context generation, so the app_context_path was always None at the call site — the prompt threading silently no-op'd. Reordering the steps makes the threading actually work. This also retires the temporary forward-declaration introduced in the previous commit; app_context_path is now defined naturally by the preceding step before the LLM reachability block reads it.
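A minimal sketch of why the earlier ordering silently lost the app context: if the prompt builder treats a missing context as optional, passing `None` raises no error and simply omits the extra line. The function name, prompt wording, and the `app_type` key are all hypothetical and not taken from the PR.

```python
def build_reachability_prompt(units, app_context=None):
    """Assemble a review prompt; app_context is optional (hypothetical shape)."""
    lines = ["Review the following code units for reachability signals."]
    if app_context:
        # Only reached when app-context generation ran before this stage.
        app_type = app_context.get("app_type", "unknown")
        lines.append(f"Application type: {app_type}")
    for unit in units:
        lines.append(f"- {unit['id']}")
    return "\n".join(lines)
```

Called with `app_context=None`, the builder still returns a valid prompt, which is why the misordering went unnoticed until the steps were reordered.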

joshbouncesecurity · 2026-05-04T20:46:38Z

Manual verification

Off by default. Requires API key. Cost note: enabling adds approximately one Opus call per 25 units.

openant scan --help shows --llm-reachability with the cost note.
Default behavior: openant scan <repo> (no flag) — pipeline unchanged from current behavior; no LLM reachability cost incurred.
Enabled: openant scan <repo> --llm-reachability — emits llm_reachability.json in the scan dir; merged signals appear on dataset.json units as llm_reachability_signals.
Promote-only: high-confidence entry_point signals on a previously-unreachable unit promote it to is_entry_point: true. Existing entry points are NEVER demoted.
Pipeline ordering: app-context generation runs BEFORE the LLM reachability stage, so app context is threaded into the reachability prompt. Verify with --quiet step ordering.
Markdown-wrapped JSON: model occasionally returns the response inside a fenced json ... block — the parser strips the fences and recovers the signals.
Empty units: openant scan <empty-or-failed-parse> --llm-reachability: stage skips gracefully (no LLM call).
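The markdown-fence recovery in the checklist above might be approximated as follows. This is a sketch under assumptions: the names `parse_signals_sketch` and `FENCE` are hypothetical, the top-level `signals` key is inferred from the reported `signals: []` response, and returning an empty list on malformed JSON is an assumed fallback rather than confirmed behavior.

```python
import json

FENCE = "`" * 3  # a markdown code-fence marker


def parse_signals_sketch(raw):
    """Parse a model response, tolerating a surrounding markdown code fence."""
    text = raw.strip()
    if text.startswith(FENCE) and text.endswith(FENCE) and len(text) >= 2 * len(FENCE):
        body = text[len(FENCE):-len(FENCE)]
        first, sep, rest = body.partition("\n")
        # Drop an optional language tag (e.g. "json") on the opening fence line.
        if sep and (not first.strip() or first.strip().isalpha()):
            body = rest
        text = body
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return []  # assumed fallback: malformed output yields no signals
    signals = payload.get("signals", []) if isinstance(payload, dict) else []
    return signals if isinstance(signals, list) else []
```

Treating unparseable output as "no signals" fits the advisory, promote-only design: a bad response can only fail to add information, never remove any.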

joshbouncesecurity · 2026-05-04T21:14:26Z

Local test results

Reinstalled openant-core from this branch and ran openant scan --llm-reachability end-to-end on the in-tree sample_python_repo (5 reachable units after the structural pass). Skipped enhance/verify/report/dynamic-test to keep cost minimal.

Commands run:

go build -o openant.exe ./
pip install -e libs/openant-core/

openant scan <sample_python_repo> --output <out> \
  --llm-reachability \
  --no-context --no-enhance --no-report --skip-dynamic-test \
  --limit 1 --model sonnet

Outcome (against the manual-verification checklist):

openant scan --help lists --llm-reachability with the cost note: "Off by default — enabling this may incur additional LLM cost (one Opus call per ~25 units)" ✅
Enabled run: llm_reachability.json written to the scan dir; the new pipeline step llm-reachability ran successfully and emitted llm-reachability.report.json with cost_usd: 0.01245, token_usage.total_tokens: 782, units_reviewed: 5 ✅
Promote-only semantics: Opus reviewed 5 units (the structural pass had already correctly tagged the two Flask endpoints as is_entry_point: True); model returned signals: []. signals_added: 0, entry_points_promoted: 0, units_touched: 0. Existing entry points were not demoted ✅
Pipeline ordered as documented: parse → llm-reachability → analyze. llm-reachability.report.json is timestamped before analyze.report.json ✅
Reachability stage runs Opus (per the report) regardless of --model sonnet (which only governs analyze) — matches cost note ✅
Did not exercise the markdown-fence recovery path or the empty-units skip path here — covered by the unit tests in the diff. Did not separately confirm app-context-threading because I passed --no-context.

Total cost: $0.024 (reachability $0.012 + analyze 1 unit $0.012). Well under the budget.

joshbouncesecurity added 4 commits May 4, 2026 21:59

joshbouncesecurity wants to merge 4 commits into knostic:master from joshbouncesecurity:feat/issue17-llm-reachability
