-
Notifications
You must be signed in to change notification settings - Fork 2k
[ENH] Eventual consistency in query nodes #5945
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
This stack of pull requests is managed by Graphite. Learn more about stacking. |
542b0a4 to
a4bcf3a
Compare
|
Add query Introduces a Key Changes• Defined Affected Areas• This summary was automatically generated by @propel-code-bot |
| collection_and_segments, | ||
| }, | ||
| payloads: request.searches, | ||
| read_level: ReadLevel::IndexAndWal, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Requirements] read_level is hardcoded to IndexAndWal here. This prevents clients from utilizing the new IndexOnly functionality even if the backend supports it.
If request (of type SearchRequest) exposes a read_level preference, it should be mapped here. If SearchRequest does not yet support this field, this limitation should be tracked.
Context for Agents
`read_level` is hardcoded to `IndexAndWal` here. This prevents clients from utilizing the new `IndexOnly` functionality even if the backend supports it.
If `request` (of type `SearchRequest`) exposes a `read_level` preference, it should be mapped here. If `SearchRequest` does not yet support this field, this limitation should be tracked.
File: rust/frontend/src/impls/service_based_frontend.rs
Line: 1837There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is deliberate and handled up-stack
| match self.read_level { | ||
| ReadLevel::IndexOnly => { | ||
| // For IndexOnly queries, skip log fetching and use empty logs | ||
| tracing::info!("Skipping log fetch for IndexOnly read level"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Maintainability] Using tracing::info! in the query execution path will generate excessive logs in high-throughput scenarios (one log per query). Consider using tracing::debug! or tracing::trace! instead to reduce noise.
Context for Agents
Using `tracing::info!` in the query execution path will generate excessive logs in high-throughput scenarios (one log per query). Consider using `tracing::debug!` or `tracing::trace!` instead to reduce noise.
File: rust/worker/src/execution/orchestration/knn_filter.rs
Line: 302
rust/worker/src/server.rs
Outdated
| .scan | ||
| .ok_or(Status::invalid_argument("Invalid Scan Operator"))?; | ||
| let read_level: ReadLevel = chroma_proto::ReadLevel::try_from(search_plan.read_level) | ||
| .unwrap_or(chroma_proto::ReadLevel::IndexAndWal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^
|
|
||
| // ReadLevel specifies which data sources to read from during queries. | ||
| // This affects consistency vs performance tradeoffs. | ||
| enum ReadLevel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ReadLevel feels unclear to me, is there a use case for this other than toggling whether to read logs or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was "eventual_consistency" before. I went back and forth on many names. I think its the clearest one I can generate that is clear, specific, and correct in its use of terms.
Databases often offer "levels" to a read but we are devoid of any formal naming here, so I prefer to be explicit in IndexOnly vs IndexAndWal. Then the question is what to call those options. If its "ConsistencyLevel" I'd expect formal notions of consistency not wishy-washy ones like what we are offering. The reason to make it an enum is we can offer BoundedConsistency in the future.
Sicheng-Pan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm
352c0f5 to
691f8b1
Compare

Description of changes
Summarize the changes made by this PR.
Test plan
How are these changes tested?
Tests will be handled e2e at the top of the stack in client changes
pytestfor python,yarn testfor js,cargo testfor rustMigration plan
This param will default to IndexAndWal to preserve existing behavior.
Observability plan
Added trace to log this is being used. In conjuction with #6077 this should be sufficient.
Documentation Changes
Updated relevant code docs