feat: LLM review stage for enhanced reachability detection #50

Draft
joshbouncesecurity wants to merge 4 commits into knostic:master from joshbouncesecurity:feat/issue17-llm-reachability

@joshbouncesecurity
Contributor

Summary

Adds a new LLM review stage (off by default, enabled via --llm-reachability on openant scan) that uses a strong model (Opus by default) to surface additional reachability signals beyond what the structural analysis catches:

  • Likely entry points the structural pass missed (framework-specific handlers, plugin/CLI registrations, message handlers).
  • External content ingestion (HTTP request bodies, file/network reads, env/argv, stdin, untrusted IPC).
  • Cross-process or async data flows.

Signals are advisory and only promote a unit's reachability — they never demote one that the structural analysis already kept. This matches the issue's "complements, not replaces" intent.

Output:

  • llm_reachability.json in the scan dir with the full signal list.
  • llm_reachability_signals: [...] field on each unit in dataset.json.
  • High-confidence entry_point signals set is_entry_point: true on the target unit.

Cost & rate-limit safety: reuses the existing GlobalRateLimiter via AnthropicClient, opt-in only, and prompts are batched (default 25 units/call).
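A minimal sketch of the batching step, for readers who want to see how 25 units/call translates into LLM calls. The function name `chunk` and the constant `DEFAULT_BATCH_SIZE` are illustrative, not the PR's actual code; the single-batch fallback for a non-positive size follows the behaviour this PR describes for its `_chunk` helper, and returning no batches for empty input is an assumption.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

DEFAULT_BATCH_SIZE = 25  # default stated in the PR summary


def chunk(items: Sequence[T], batch_size: int = DEFAULT_BATCH_SIZE) -> List[List[T]]:
    """Split items into consecutive batches of at most batch_size elements."""
    if not items:
        # Assumption: nothing to review means no batches, hence no LLM call.
        return []
    if batch_size <= 0:
        # Non-positive sizes collapse to one batch covering all items.
        return [list(items)]
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


units = [f"unit_{n}" for n in range(60)]
print([len(batch) for batch in chunk(units)])  # [25, 25, 10]
print(len(chunk(units, 0)))                    # 1
```

Under these assumptions the number of calls is simply the number of batches, so 60 units at the default size would cost three calls.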

Addresses #17 (does not close — let the maintainer review the prompt + heuristics first).

Test plan

  • Unit tests for analyze_reachability with mocked LLM: fixed JSON, malformed JSON, exception in client, app_context threading, batch chunking.
  • Unit tests for apply_signals: promote-only semantics (high-confidence promotes, medium does not, never demotes), per-unit signal accumulation, unknown-id rejection.
  • CLI plumbing tests: --llm-reachability appears in openant scan --help; default does not pass the flag through to scan_repository; setting it threads llm_reachability=True.
  • Existing pytest suite passes: 112 passed, 23 skipped (env-dependent, e.g. Go binary).
  • Manual: enable on a small Express fixture; verify route handlers gain entry-point signals.
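The promote-only semantics exercised by the `apply_signals` tests could look roughly like the sketch below. It is not the PR's implementation: the signal fields `unit_id`, `kind` and `confidence`, the dict-of-units layout, and the `rejected_unknown_id` counter are assumptions made for illustration. Only `llm_reachability_signals`, `is_entry_point`, `signals_added` and `entry_points_promoted` are names taken from this PR.

```python
from typing import Any, Dict, List


def apply_signals(
    units: Dict[str, Dict[str, Any]],
    signals: List[Dict[str, Any]],
) -> Dict[str, int]:
    """Merge advisory signals into units; may promote, never demotes."""
    stats = {"signals_added": 0, "entry_points_promoted": 0, "rejected_unknown_id": 0}
    for signal in signals:
        unit = units.get(signal.get("unit_id"))
        if unit is None:
            # The model referenced a unit that is not in the dataset: drop it.
            stats["rejected_unknown_id"] += 1
            continue
        # Signals accumulate per unit rather than overwriting each other.
        unit.setdefault("llm_reachability_signals", []).append(signal)
        stats["signals_added"] += 1
        is_high_entry = (
            signal.get("kind") == "entry_point"
            and signal.get("confidence") == "high"
        )
        # Only ever flip False -> True; no code path sets the flag to False.
        if is_high_entry and not unit.get("is_entry_point"):
            unit["is_entry_point"] = True
            stats["entry_points_promoted"] += 1
    return stats


units = {
    "a": {"is_entry_point": False},
    "b": {"is_entry_point": True},
    "c": {"is_entry_point": False},
}
signals = [
    {"unit_id": "a", "kind": "entry_point", "confidence": "high"},
    {"unit_id": "b", "kind": "external_input", "confidence": "medium"},
    {"unit_id": "c", "kind": "entry_point", "confidence": "medium"},
    {"unit_id": "missing", "kind": "entry_point", "confidence": "high"},
]
stats = apply_signals(units, signals)
print(stats)
```

In this example unit `a` is promoted, `b` keeps its existing entry-point status, `c` records the medium-confidence signal without being promoted, and the signal for the unknown id is rejected.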

Adds an opt-in LLM review stage (off by default, enabled via the new
`--llm-reachability` flag on `openant scan`) that uses a strong model
(Opus by default) to surface additional reachability signals beyond
what the structural pass catches:

- Likely entry points the structural analysis may miss (framework
  hooks, plugin/CLI registrations, message handlers).
- External-input sites (HTTP request bodies, file/network reads,
  env/argv, stdin, untrusted IPC).
- Cross-process / async data-flow indicators.

Signals are advisory and *promote-only*: high-confidence entry-point
signals can set `is_entry_point=True` on a unit, but no signal ever
demotes a unit that the structural analysis already kept. This matches
the "complements, does not replace" intent in issue #17.

Output:
- `llm_reachability.json` written to the scan dir with the full signal
  list.
- Each unit gains an `llm_reachability_signals` array on the dataset.

Cost & rate-limit safety: opt-in only, prompts are batched, and the
client integration goes through the existing `AnthropicClient` (which
respects `GlobalRateLimiter`).

Refs #17.

The Python CLI defines --llm-reachability for the LLM reachability stage
(issue #17), but the Go CLI proxy did not expose it. The test
TestHelp::test_scan_help_advertises_llm_reachability inspects 'openant
scan --help' (Go cobra output) and was failing on all 3 OS targets.

Register --llm-reachability as a Bool flag on the Go scan command and
pass it through to the Python invocation when set.

- scanner.py: forward-declare app_context_path before step 1.5 so the
  LLM reachability block doesn't hit a NameError when --llm-reachability
  is enabled (the block ran before the app-context step that defined it).
- llm_reachability._chunk: non-positive batch_size used to reference an
  unbound loop variable; now collapses to a single batch covering all
  items. Adds a regression test.
- Help text (Python CLI + Go CLI): note that --llm-reachability may
  incur additional LLM cost, per cost-safety review.

The LLM reachability stage threads app_context into its prompt to help
the model reason about expected entry points (web_app vs cli_tool, etc).
The previous ordering ran it before app-context generation, so the
app_context_path was always None at the call site — the prompt threading
silently no-op'd. Reordering the steps makes the threading actually work.

This also retires the temporary forward-declaration introduced in the
previous commit; app_context_path is now defined naturally by the
preceding step before the LLM reachability block reads it.

@joshbouncesecurity
Contributor Author

joshbouncesecurity commented May 4, 2026

Manual verification

Off by default. Requires API key. Cost note: enabling adds approximately one Opus call per 25 units.

  • openant scan --help shows --llm-reachability with the cost note.
  • Default behavior: openant scan <repo> (no flag) — pipeline unchanged from current behavior; no LLM reachability cost incurred.
  • Enabled: openant scan <repo> --llm-reachability — emits llm_reachability.json in the scan dir; merged signals appear on dataset.json units as llm_reachability_signals.
  • Promote-only: high-confidence entry_point signals on a previously-unreachable unit promote it to is_entry_point: true. Existing entry points are NEVER demoted.
  • Pipeline ordering: app-context generation runs BEFORE the LLM reachability stage, so app context is threaded into the reachability prompt. Verify by checking the step ordering in the --quiet output.
  • Markdown-wrapped JSON: the model occasionally returns its response inside a Markdown code fence (```json ... ```); the parser strips the fence and recovers the signals.
  • Empty units: openant scan <empty-or-failed-parse> --llm-reachability: stage skips gracefully (no LLM call).
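The fence-recovery item in the checklist above can be sketched as follows. This is a minimal illustration, not the PR's parser: the function name `parse_signals`, the regular expression, and the choice to return an empty list on malformed JSON are assumptions. The top-level `signals` key matches the `signals: []` response quoted in the local test results.

```python
import json
import re
from typing import Any, Dict, List

# Matches a response wrapped in a Markdown code fence, with an optional
# language tag such as "json" after the opening fence.
_FENCE_RE = re.compile(r"^\s*`{3}[A-Za-z0-9_-]*\s*\n(.*?)\n?\s*`{3}\s*$", re.DOTALL)


def parse_signals(raw: str) -> List[Dict[str, Any]]:
    """Extract the signal list from a model response, fenced or not."""
    text = raw.strip()
    match = _FENCE_RE.match(text)
    if match:
        text = match.group(1)  # keep only the fenced body
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        # Assumption: malformed output yields no signals instead of failing the scan.
        return []
    signals = payload.get("signals", []) if isinstance(payload, dict) else []
    if not isinstance(signals, list):
        return []
    return [s for s in signals if isinstance(s, dict)]


FENCE = "`" * 3
body = '{"signals": [{"unit_id": "a", "kind": "entry_point", "confidence": "high"}]}'
fenced = f"{FENCE}json\n{body}\n{FENCE}"

print(parse_signals(fenced) == parse_signals(body))  # True
print(parse_signals("not json at all"))              # []
```

Returning an empty list on bad input keeps the stage advisory: a garbled response adds no signals and leaves the structural results untouched.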

@joshbouncesecurity
Contributor Author

Local test results

Reinstalled openant-core from this branch and ran openant scan --llm-reachability end-to-end on the in-tree sample_python_repo (5 reachable units after the structural pass). Skipped enhance/verify/report/dynamic-test to keep cost minimal.

Commands run:

go build -o openant.exe ./
pip install -e libs/openant-core/

openant scan <sample_python_repo> --output <out> \
  --llm-reachability \
  --no-context --no-enhance --no-report --skip-dynamic-test \
  --limit 1 --model sonnet

Outcome (against the manual-verification checklist):

  • openant scan --help lists --llm-reachability with the cost note: "Off by default — enabling this may incur additional LLM cost (one Opus call per ~25 units)" ✅
  • Enabled run: llm_reachability.json written to the scan dir; the new pipeline step llm-reachability ran successfully and emitted llm-reachability.report.json with cost_usd: 0.01245, token_usage.total_tokens: 782, units_reviewed: 5
  • Promote-only semantics: Opus reviewed 5 units (the structural pass had already correctly tagged the two Flask endpoints as is_entry_point: True); model returned signals: []. signals_added: 0, entry_points_promoted: 0, units_touched: 0. Existing entry points were not demoted ✅
  • Pipeline ordered as documented: parse → llm-reachability → analyze. llm-reachability.report.json is timestamped before analyze.report.json
  • Reachability stage runs Opus (per the report) regardless of --model sonnet (which only governs analyze) — matches cost note ✅
  • Did not exercise the markdown-fence recovery path or the empty-units skip path here — covered by the unit tests in the diff. Did not separately confirm app-context-threading because I passed --no-context.

Total cost: $0.024 (reachability $0.012 + analyze 1 unit $0.012). Well under the budget.
