feat: kb.triage_pending — advisory scoring on pending proposals for reviewers #322

Description

@plind-junior

the pending-review queue is the human bottleneck. a reviewer facing a long kb.list_pending has to reconstruct, per proposal, whether the claim fits what's already approved, whether its evidence is real and load-bearing, whether it duplicates an existing claim, and whether it contradicts one. those signals already exist in scattered form — propose_claim computes similarity warnings (find_similar_on_propose, issue #147), the payload gates in proposals._payload_block_reason know about dangling refs — but nothing surfaces them together as a ranked, explained view for the reviewer.

this proposes an optional triage pass that scores each pending proposal on fit, citation quality, duplication risk, and contradiction risk, then attaches an advisory recommendation plus a short rationale to the proposal's view. the score is advisory only: a human still calls kb.approve / kb.reject, and the pass never decides, writes an approved artifact, or moves a proposal out of pending.

proposed surface

new read-side method kb.triage_pending that takes optional proposal_ids (default: all pending) and returns, per proposal, the existing model_dump plus a _meta.vouch_triage block:

  • recommendation: one of approve / reject / needs-human (advisory string; never actioned)
  • score: 0.0–1.0 confidence in the recommendation
  • signals: {fit, citation_quality, duplication_risk, contradiction_risk}, each a scored sub-result reusing the existing embedding similarity path (embeddings.similarity.find_similar_on_propose) and the ref checks in proposals._payload_block_reason
  • rationale: short lowercase prose the reviewer reads before deciding
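a minimal sketch of what the _meta.vouch_triage block could look like in python. the field names come from the list above; the class names TriageSignals / TriageBlock and the helper attach_triage are hypothetical, and each signal is flattened to a single float here even though the real sub-results may carry more detail.

```python
from dataclasses import asdict, dataclass
from typing import Literal

# The three advisory values named in the proposal; never actioned by code.
Recommendation = Literal["approve", "reject", "needs-human"]


@dataclass(frozen=True)
class TriageSignals:
    # Assumption: each signal is reduced to one float in [0.0, 1.0].
    fit: float
    citation_quality: float
    duplication_risk: float
    contradiction_risk: float


@dataclass(frozen=True)
class TriageBlock:
    recommendation: Recommendation
    score: float  # 0.0-1.0 confidence in the recommendation
    signals: TriageSignals
    rationale: str  # short prose shown to the reviewer


def attach_triage(proposal_dump: dict, block: TriageBlock) -> dict:
    """Return a copy of the proposal dump with the advisory block under _meta.

    The input dict is not mutated, which keeps the pass read-only with
    respect to whatever produced the dump.
    """
    out = dict(proposal_dump)
    meta = dict(out.get("_meta", {}))
    meta["vouch_triage"] = asdict(block)
    out["_meta"] = meta
    return out
```

because attach_triage copies rather than mutates, the annotated view can be handed to the cli or a json response without touching the stored proposal.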

cli mirror:

vouch triage                    # score all pending, print ranked table
vouch triage <proposal-id>...   # score a subset
vouch triage --json             # machine-readable _meta.vouch_triage blocks
vouch triage --sort score       # worst-first or best-first ordering

config under .vouch/config.yaml (typed via #243 when it lands):

triage:
  enabled: false          # opt-in; off by default
  backend: embeddings     # scoring backend; degrades to heuristic if extra absent
  weights:                # per-signal weights into the composite score
    fit: 0.3
    citation_quality: 0.3
    duplication_risk: 0.2
    contradiction_risk: 0.2
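one way the weights could fold into a single number, as a sketch only. the issue does not say how the two risk signals combine with fit and citation quality, so this assumes risk signals are inverted (high risk lowers the composite) and that weights are normalised by their sum. composite_score and RISK_SIGNALS are hypothetical names.

```python
# Defaults copied from the config example above.
DEFAULT_WEIGHTS = {
    "fit": 0.3,
    "citation_quality": 0.3,
    "duplication_risk": 0.2,
    "contradiction_risk": 0.2,
}

# Assumption: for these signals a higher value is worse, so they are inverted.
RISK_SIGNALS = frozenset({"duplication_risk", "contradiction_risk"})


def composite_score(signals: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted average of the four signals, returned in [0.0, 1.0]."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("triage weights must sum to a positive number")
    acc = 0.0
    for name, weight in weights.items():
        # Clamp each signal so a misbehaving backend cannot push the score out of range.
        value = min(1.0, max(0.0, signals[name]))
        if name in RISK_SIGNALS:
            value = 1.0 - value
        acc += weight * value
    return acc / total
```

normalising by the weight sum means a config whose weights do not add up to 1.0 still yields a score in range, rather than failing at load time.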

as a new kb.* method, kb.triage_pending must touch the four registration sites — @mcp.tool() in src/vouch/server.py, _h_triage_pending + HANDLERS["kb.triage_pending"] in src/vouch/jsonl_server.py, METHODS in src/vouch/capabilities.py, the cli command in src/vouch/cli.py — plus tests/test_triage.py.

review gate & scope

the review gate stays load-bearing. triage is a read-side pass: it computes over pending proposals and attaches advisory metadata, and it does not call proposals.approve, proposals.reject, store.put_*, or store.move_proposal_to_decided. the recommendation is a hint on the proposal's view; a human still issues kb.approve / kb.reject. if run on a schedule or in the background it only reads and annotates — it never decides, auto-approves, or writes an approved artifact.

this is explicitly distinct from #162 (review-gate policy engine: rule-based conditional auto-approve / block / escalation). #162 can take an action that bypasses a human for matching rules; this issue takes no action at all. it also differs from #147 (propose-time similarity warnings on a single claim as it's filed): triage runs over the whole pending queue at review time and folds duplication into a broader four-signal composite alongside fit, citation quality, and contradiction risk.

scoring logic lives in a new module beside the read tools (as salience.py does); storage.py stays pure i/o. the pass runs fully local against the on-disk kb and the derived state.db index — no network call, no external service, no change to the yaml storage format.
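a sketch of how the scoring module could degrade when the embeddings extra is absent. the issue does not define the heuristic, so token-level jaccard overlap is used here purely as a stand-in; duplication_risk, jaccard, and the embed_similarity parameter are hypothetical and do not reflect the real signature of find_similar_on_propose.

```python
import re
from typing import Callable, Iterable, Optional


def _tokens(text: str) -> set:
    """Lowercased alphanumeric tokens; punctuation is dropped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Heuristic similarity: shared tokens over all tokens, in [0.0, 1.0]."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def duplication_risk(
    claim: str,
    approved_claims: Iterable[str],
    embed_similarity: Optional[Callable[[str, str], float]] = None,
) -> float:
    """Highest similarity between the pending claim and any approved claim.

    embed_similarity is an assumed hook for the embedding backend; when it
    is None (base install) the function falls back to the jaccard heuristic,
    so a value is always returned.
    """
    similarity = embed_similarity or jaccard
    return max((similarity(claim, other) for other in approved_claims), default=0.0)
```

both paths are plain in-process computation, which fits the constraint that the pass makes no network call.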

acceptance criteria

  • kb.triage_pending returns each pending proposal's model_dump with a _meta.vouch_triage block (recommendation, score, signals, rationale)
  • the pass never calls proposals.approve / proposals.reject / store.move_proposal_to_decided / store.put_*; a test asserts no proposal leaves pending after a triage run
  • recommendation is advisory string-only and is not consumed by any decision path
  • duplication and fit signals reuse the existing embedding similarity path; base install (no embeddings extra) degrades to a heuristic and still returns a block
  • triage.enabled: false is the default; the method is opt-in
  • all four registration sites present and test_capabilities stays green
  • vouch triage prints a ranked table with --json and --sort flags
  • tests/test_triage.py covers scoring output shape, the no-write invariant, and the embeddings-absent fallback
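the no-write invariant could be tested roughly as below. this is a sketch against a hypothetical in-memory RecordingStore, not the real store api: put_claim stands in for the store.put_* family, list_pending and the placeholder triage_pending body are assumptions, and a real test would run against the actual kb fixtures.

```python
class RecordingStore:
    """Hypothetical stand-in for the store that records every write call."""

    def __init__(self, pending: dict):
        self._pending = dict(pending)
        self.writes = []

    def list_pending(self) -> list:
        # Return copies so callers cannot mutate stored proposals by accident.
        return [dict(p) for p in self._pending.values()]

    def put_claim(self, *args, **kwargs) -> None:
        self.writes.append("put_claim")

    def move_proposal_to_decided(self, *args, **kwargs) -> None:
        self.writes.append("move_proposal_to_decided")


def triage_pending(store: RecordingStore) -> list:
    """Placeholder pass: reads pending proposals and annotates copies only."""
    results = []
    for proposal in store.list_pending():
        proposal["_meta"] = {
            "vouch_triage": {
                "recommendation": "needs-human",
                "score": 0.0,
                "signals": {},
                "rationale": "placeholder rationale",
            }
        }
        results.append(proposal)
    return results


def test_triage_leaves_queue_untouched() -> None:
    store = RecordingStore({"p1": {"id": "p1"}, "p2": {"id": "p2"}})
    before = store.list_pending()

    annotated = triage_pending(store)

    # Every proposal comes back with an advisory block attached.
    assert all("vouch_triage" in p["_meta"] for p in annotated)
    # No write method was called and the pending queue is unchanged.
    assert store.writes == []
    assert store.list_pending() == before
```

asserting on both the recorded write calls and the queue contents catches a pass that writes through an unrecorded path as well as one that calls a known write method.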

Labels: enhancement, retrieval, size: L
