Add v2 page-memory ingestion API #196

Merged: suguanYang merged 3 commits into main from feat/wangbinqi/knowhere-v2-page-memory-api on Jul 2, 2026

@suguanYang

Contributor

Summary

  • Adds the /v2 API registry for jobs, documents, and retrieval.
  • Routes /v1/jobs through legacy chunk ingestion and /v2/jobs through server-selected ingestion commands.
  • Uses page-memory for v2 PDF/PPTX jobs, while accepting other supported v2 file types on the legacy chunk path for v1-compatible behavior.
  • Persists canonical job metadata for api_version, resolved parse_track, processing_generation, and page-memory worker config.
  • Moves page-memory runtime config out of public request/env switches and into server defaults captured in job metadata.
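
The routing and metadata rules above can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: `resolve_parse_track`, `build_job_metadata`, `JobMetadata`, and `DEFAULT_PAGE_MEMORY_CONFIG` are hypothetical names, the config contents are a placeholder, and the integer type of `processing_generation` and the `None` config for chunk jobs are assumptions.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

# File types the summary says go through page-memory on /v2.
PAGE_MEMORY_EXTENSIONS = {".pdf", ".pptx"}

# Placeholder for server-side defaults; the real keys are not given in this PR text.
DEFAULT_PAGE_MEMORY_CONFIG = {"example_option": "server-default"}


@dataclass(frozen=True)
class JobMetadata:
    """Hypothetical shape of the canonical metadata persisted with a job."""
    api_version: str
    parse_track: str
    processing_generation: int  # assumed to be an integer counter
    page_memory_config: Optional[dict]


def resolve_parse_track(api_version: str, filename: str) -> str:
    """Pick the parse track on the server; the client never chooses it on v2."""
    if api_version == "v1":
        return "chunk"  # /v1/jobs always uses legacy chunk ingestion
    if api_version == "v2":
        ext = PurePosixPath(filename).suffix.lower()
        return "page_memory" if ext in PAGE_MEMORY_EXTENSIONS else "chunk"
    raise ValueError(f"unknown api_version: {api_version!r}")


def build_job_metadata(api_version: str, filename: str,
                       processing_generation: int = 1) -> JobMetadata:
    """Capture the resolved track and the server defaults at job creation time."""
    track = resolve_parse_track(api_version, filename)
    # Assumption: worker config is only stored for page-memory jobs.
    config = dict(DEFAULT_PAGE_MEMORY_CONFIG) if track == "page_memory" else None
    return JobMetadata(api_version, track, processing_generation, config)
```

Under these assumptions, `build_job_metadata("v2", "report.pdf")` resolves to the `page_memory` track with a copy of the server defaults, while a `.txt` upload on v2 falls back to `chunk`. Copying the defaults into the job record means later changes to server defaults would not alter how an already-created job is processed.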

Closes #194

Tests

  • uv run --all-packages pytest apps/api/tests/contract/test_page_memory_parse_track_contract.py apps/api/tests/contract/test_job_creation_contract.py
  • uv run --all-packages pytest apps/worker/tests/contract/test_page_memory_asset_java_contract.py apps/worker/tests/contract/test_page_memory_fine_hierarchy_contract.py apps/worker/tests/contract/test_page_memory_node_assembler_contract.py apps/worker/tests/contract/test_document_agent_budget_contract.py
  • make lint
  • make typecheck
  • git diff --check

Local smoke test

  • Started local API and worker.
  • Created /v2/jobs PDF job job_839fb1c5b274; worker ran parse_track=page_memory; job reached done; /v2/retrieval/query returned a page result.
  • Created /v2/jobs TXT job job_a90b35ae17e3; worker ran parse_track=chunk; job reached done; /v2/retrieval/query returned a text result.
  • Verified /v2/jobs rejects public parse_track and /v1/jobs rejects parse_track=page_memory.
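
The two rejection rules checked in the last bullet could be expressed as a request validator along these lines. This is a hedged sketch: `validate_job_request` and `RequestRejected` are hypothetical names, the request body is assumed to be a plain dict, and whether /v1/jobs accepts other `parse_track` values is not stated in this PR, so the sketch lets them through.

```python
class RequestRejected(ValueError):
    """Hypothetical error type for a job request that fails validation."""


def validate_job_request(api_version: str, body: dict) -> None:
    """Raise RequestRejected if the request tries to choose a disallowed parse track."""
    if api_version == "v2" and "parse_track" in body:
        # On v2 the server selects the ingestion path, so any public value is refused.
        raise RequestRejected("parse_track is not a public field on /v2/jobs")
    if api_version == "v1" and body.get("parse_track") == "page_memory":
        # v1 stays on legacy chunk ingestion and cannot opt in to page-memory.
        raise RequestRejected("parse_track=page_memory is not available on /v1/jobs")
```

Validation of this kind would typically run before the parse track is resolved, so a rejected request never reaches job creation.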

@suguanYang merged commit 3dddcd8 into main on Jul 2, 2026
6 checks passed

Development

Successfully merging this pull request may close these issues.

Design and implement Knowhere v2 server API for page-memory ingestion
