
Design and implement Knowhere v2 server API for page-memory ingestion #194

Description

@suguanYang

What to build

Create a server-side Knowhere v2 API surface where the API version itself selects the ingestion generation.

There must be no public request parameter that lets clients choose a parsing mode. v1 and v2 should be separated at the route/service layer so the new page-memory/VLM process does not interleave with v1 logic.

Server scope is limited to jobs, documents, and retrieval. MCP, SDK, and CLI changes are out of scope for this issue.

Core behavior:

  • Add a /v2 route registry for jobs, documents, and retrieval.
  • Keep /v1/jobs on the legacy text/chunk ingestion path.
  • Make POST /v2/jobs use the v2 ingestion path directly. The client sends normal job creation fields only; no processing, mode, or parse_track field is exposed in v2.
  • Keep v1 and v2 ingestion logic separated by adapters/services, not by a shared client-selected mode switch.
  • Persist canonical job metadata that records the API version and resolved internal processing generation for observability and worker dispatch.
  • Remove page-memory-specific env controls; v2 defaults are server-side configuration/defaults captured into job metadata for the worker.
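
A minimal sketch of the core behavior above, assuming plain dict payloads. Every identifier here (IngestionCommand, build_v1_command, build_v2_command, the generation labels) is hypothetical and not taken from the Knowhere codebase. It also assumes v2 rejects mode fields outright; silently dropping them would be another reading of "no mode field is exposed".

```python
from dataclasses import dataclass
from typing import Any, Mapping

# Hypothetical labels for the internal processing generation.
GENERATION_BY_API_VERSION = {"v1": "legacy_chunk_text", "v2": "page_memory"}

# Fields the issue says must not be exposed on POST /v2/jobs.
FORBIDDEN_V2_FIELDS = frozenset({"processing", "mode", "parse_track"})


@dataclass(frozen=True)
class IngestionCommand:
    """Internal command handed to shared creation/storage primitives."""
    api_version: str
    generation: str
    payload: dict[str, Any]


def build_v1_command(request: Mapping[str, Any]) -> IngestionCommand:
    # v1 adapter: always the legacy chunk/text path.
    return IngestionCommand("v1", GENERATION_BY_API_VERSION["v1"], dict(request))


def build_v2_command(request: Mapping[str, Any]) -> IngestionCommand:
    # v2 adapter: the route decides the generation, never the client.
    exposed = sorted(FORBIDDEN_V2_FIELDS.intersection(request))
    if exposed:
        raise ValueError(f"fields not accepted by /v2/jobs: {exposed}")
    return IngestionCommand("v2", GENERATION_BY_API_VERSION["v2"], dict(request))
```

The two adapters share only the command type, so neither path needs to branch on a request-level mode.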

Implementation design:

  • Add apps/api/app/api/v2/api_v2.py and v2 route modules for jobs, documents, and retrieval.
  • Register /v2 in the top-level API router alongside /v1.
  • Add a v2 job creation schema that mirrors normal job creation input but excludes v1 mode fields: no parse_track, no processing, and no public page-memory tuning fields.
  • Add an internal ingestion command/adapter seam:
    • v1 adapter builds a legacy chunk/text ingestion command.
    • v2 adapter builds a v2/page-memory ingestion command.
    • shared creation/storage/metadata primitives remain reused below that seam.
  • Persist canonical job metadata fields such as api_version, parse_track or internal processing generation, original_request, source metadata, and resolved worker config.
  • Extend job metadata helpers to read the resolved internal v2 config.
  • Add worker-side page-memory config dataclasses and pass them through PageMemoryInput instead of reading PAGE_MEMORY_* env vars.
  • Remove RETRIEVAL_PAGE_MEMORY_ENABLED from settings, env examples, validation, and tests.
  • Update auth/rate-limit/admission route allowlists for /v2/jobs, /v2/documents, and /v2/retrieval/query.
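
One way the metadata and worker-config items above could fit together, as a sketch only. PageMemoryInput is named in this issue but its fields are not, so job_id and config are assumptions. PageMemoryConfig, its tuning fields (max_pages, render_dpi), the metadata key names, and the helper functions are all hypothetical placeholders.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class PageMemoryConfig:
    # Placeholder tuning fields; the real set is not specified in this issue.
    max_pages: int = 200
    render_dpi: int = 144


@dataclass(frozen=True)
class PageMemoryInput:
    # Assumed shape: the worker receives its config explicitly.
    job_id: str
    config: PageMemoryConfig


def build_v2_job_metadata(
    original_request: Mapping[str, Any],
    defaults: PageMemoryConfig = PageMemoryConfig(),
) -> dict[str, Any]:
    # Capture server-side defaults at creation time so the worker
    # never has to consult PAGE_MEMORY_* environment variables.
    return {
        "api_version": "v2",
        "processing_generation": "page_memory",
        "original_request": dict(original_request),
        "worker_config": {"page_memory": asdict(defaults)},
    }


def read_page_memory_config(metadata: Mapping[str, Any]) -> PageMemoryConfig:
    # Unknown keys are ignored and missing keys fall back to dataclass defaults.
    stored = metadata.get("worker_config", {}).get("page_memory", {})
    known = {f.name for f in fields(PageMemoryConfig)}
    return PageMemoryConfig(**{k: v for k, v in stored.items() if k in known})


def build_worker_input(job_id: str, metadata: Mapping[str, Any]) -> PageMemoryInput:
    return PageMemoryInput(job_id=job_id, config=read_page_memory_config(metadata))
```

Resolving the config once at job creation keeps a job's behavior stable even if server defaults change before the worker picks it up.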

Acceptance criteria

  • /v2/jobs, /v2/documents, and /v2/retrieval are registered and visible in OpenAPI.
  • POST /v2/jobs accepts normal job creation input without processing, mode, or parse_track.
  • POST /v1/jobs remains on the legacy chunk/text path.
  • POST /v2/jobs uses the v2 ingestion path without requiring or accepting a client-selected mode.
  • v1 and v2 ingestion are separated at route/adapter/service level, not interleaved through a shared request mode switch.
  • v2-created jobs persist API version and resolved internal processing metadata for worker dispatch and debugging.
  • Worker page-memory execution reads resolved config from job metadata/defaults, not env vars.
  • RETRIEVAL_PAGE_MEMORY_ENABLED and os.environ.get("PAGE_MEMORY...") usages are removed.
  • /v2/jobs read/list/confirm-upload, /v2/documents/*, and /v2/retrieval/query preserve compatible response contracts.
  • Contract tests cover v1 compatibility, v2 route behavior, v2 aliases, worker metadata config, and static env cleanup.
  • One integration test proves a job created through /v2/jobs follows the v2 ingestion path and can be queried through /v2/retrieval/query.
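
The "static env cleanup" criterion could be checked with a small source scan like the sketch below. The function names and the exact patterns are assumptions; the os.getenv form is included as a guess at an equivalent usage that should also disappear.

```python
import re
from pathlib import Path

# Patterns that should no longer appear once the cleanup is done.
FORBIDDEN_PATTERNS = (
    re.compile(r"RETRIEVAL_PAGE_MEMORY_ENABLED"),
    re.compile(r"""os\.(?:environ(?:\.get\(|\[)|getenv\()\s*["']PAGE_MEMORY"""),
)


def find_env_leftovers(source: str) -> list[str]:
    """Return 'line_no: text' for each line that still uses a removed env control."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        if any(pattern.search(line) for pattern in FORBIDDEN_PATTERNS):
            hits.append(f"{line_no}: {line.strip()}")
    return hits


def scan_tree(root: Path) -> dict[str, list[str]]:
    """Scan every .py file under root and map file path to its leftover lines."""
    results = {}
    for path in sorted(root.rglob("*.py")):
        hits = find_env_leftovers(path.read_text(encoding="utf-8"))
        if hits:
            results[str(path)] = hits
    return results
```

A contract test would assert that scan_tree returns an empty dict for the API and worker source roots. The module holding these patterns matches itself, so it would need to live outside the scanned roots or be excluded explicitly.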

Blocked by

None - can start immediately.
