
docs: bulk reprocess documentation + fix field extraction truncation description (#289) #294

Merged: duguankui merged 1 commit into main from docs/289-reprocessing on Jun 10, 2026
duguankui commented on Jun 10, 2026

Review and complete documentation related to the bulk reprocess work in #289.

Changes

  • Add docs/reprocessing.md, which covers the bulk reprocessing feature merged into main:
    • Three entry points: bulk field re-extraction (leaf operation, light warning), bulk reclassification (cascading and destructive, heavy warning), and single-document "re-extract fields only"
    • Reclassification scope (a specific type only / all documents across types / the pending-review queue), plus default protection of manually confirmed documents
    • Lifecycle-neutral field-extraction pipeline; chained dispatcher, keyset, and idempotency mechanism
    • Permissions, REST endpoints, throughput (handled by the host background job manager; concurrency is a deployment-layer concern), and what is out of scope (no re-OCR)
  • Fix docs/ai-provider.md: MaxTextLengthPerExtraction truncates only the classification and cabinet selection input; field extraction receives the full document without truncation (code-verified: FieldExtractionWorkflow does not read this setting). The original text incorrectly stated that field extraction input is truncated as well.
  • Cross-links: the classification.md "See also" section and the README documentation index now point to reprocessing.md.
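
To make the corrected ai-provider.md statement concrete, here is a minimal sketch of the behaviour it describes. It is not the project's code: the class and function names are hypothetical, and it assumes the limit is a character count applied from the start of the text. Only the option name MaxTextLengthPerExtraction and the "truncate classification and cabinet selection, not field extraction" rule come from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AiProviderOptions:
    # Hypothetical stand-in for the real configuration section; only the
    # option name MaxTextLengthPerExtraction is taken from the PR text.
    max_text_length_per_extraction: int


def classification_input(text: str, options: AiProviderOptions) -> str:
    """Classification input is cut to the configured length (assumed: characters)."""
    return text[: options.max_text_length_per_extraction]


def cabinet_selection_input(text: str, options: AiProviderOptions) -> str:
    """Cabinet selection input is cut to the same configured length."""
    return text[: options.max_text_length_per_extraction]


def field_extraction_input(text: str, options: AiProviderOptions) -> str:
    """Field extraction does not read the option and receives the full text."""
    return text
```

Under these assumptions, a 25-character document with a limit of 10 yields 10 characters for classification and cabinet selection and all 25 characters for field extraction.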

Pure documentation change; no code touched.
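
For readers unfamiliar with the "chained dispatcher + keyset + idempotency" combination listed under docs/reprocessing.md, the general pattern can be sketched as follows. This is an illustration only, assuming "keyset" refers to keyset pagination (resuming from the last seen id rather than an offset); every name here (InMemoryStore, reextract, dispatch, the batch size) is hypothetical and not taken from the repository.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    # Hypothetical stand-in for the document table and the job bookkeeping.
    doc_ids: list[int]
    processed: set[tuple[str, int]] = field(default_factory=set)
    runs: int = 0

    def next_batch(self, after_id: int, size: int) -> list[int]:
        # Keyset pagination: "WHERE id > cursor ORDER BY id LIMIT size".
        return [i for i in sorted(self.doc_ids) if i > after_id][:size]


def reextract(store: InMemoryStore, job_id: str, doc_id: int) -> bool:
    """Idempotent unit of work: a (job, document) pair runs at most once."""
    key = (job_id, doc_id)
    if key in store.processed:
        return False  # already handled, e.g. after a retried link
    store.processed.add(key)
    store.runs += 1
    return True


def dispatch(store: InMemoryStore, job_id: str, after_id: int = 0, size: int = 2):
    """Process one batch; return the cursor for the next link, or None when done."""
    batch = store.next_batch(after_id, size)
    if not batch:
        return None
    for doc_id in batch:
        reextract(store, job_id, doc_id)
    return batch[-1]
```

In a real deployment each link would presumably enqueue the next one through the host background job manager instead of being driven by a loop; because the unit of work is idempotent, replaying a link does not re-run documents that were already handled.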

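- docs/reprocessing.md: bulk field re-extraction / reclassification (scope + protection of manually confirmed documents) / single-document re-extract fields only,
  lifecycle-neutral field-extraction pipeline, chained dispatch mechanism, permissions, REST endpoints, out-of-scope
- ai-provider.md fix: MaxTextLengthPerExtraction truncates only classification + cabinet selection; field extraction is fed the full text without truncation
- classification.md "See also" + README documentation index cross-link to reprocessing.md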

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
duguankui merged commit b6315f6 into main on Jun 10, 2026
2 checks passed
duguankui deleted the docs/289-reprocessing branch on June 10, 2026 at 00:05
duguankui changed the title from "docs: 批量重处理文档 + 修正字段抽取截断说明 (#289)" to "docs: bulk reprocess documentation + fix field extraction truncation description (#289)" on Jun 12, 2026