filter#154
VectorDB Benchmark - Ready To Run
Post one of the commands below. Only members with write access can trigger runs.
Available Modes
Infrastructure
Both servers start on demand and are always terminated after the run — pass or fail.
How Correctness Benchmarking Works
VectorDB Benchmark — Failed. Triggered by @shaleenji · Commit ``
Before: 1M indexing time on an 8 CPU / 30 GB OVH B3-32 machine, int filter: 981 seconds
After: 1M indexing time on an 8 CPU / 30 GB OVH B3-32 machine, int filter: 873 seconds (~11% reduction in indexing time)
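As a quick check of the quoted figure: (981 - 873) / 981 ≈ 0.110, i.e. roughly an 11% reduction. A tiny helper, written here purely for verification and not part of the project:

```cpp
#include <cassert>
#include <cmath>

// Relative reduction between a "before" and an "after" timing.
// E.g. relative_reduction(981.0, 873.0) ≈ 0.110.
constexpr double relative_reduction(double before, double after) {
    return (before - after) / before;
}
```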
Things to do:
Filter writes are not atomic with vector/meta/HNSW/WAL writes. A failure can leave vector metadata, filter indexes, sparse storage, and HNSW out of sync.
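One common way to close such a gap, sketched here purely as an illustration and not as this project's design, is to log every multi-structure update to the WAL before applying any part of it, then replay records that never reached commit on recovery. All names below (`WalRecord`, `Wal`, `recover`) are hypothetical:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// A logged update that covers all structures (vectors, metadata,
// filter index, HNSW) as one logical unit.
struct WalRecord {
    std::string payload;     // serialized multi-structure update
    bool committed = false;  // set only after every part is applied
};

struct Wal {
    std::vector<WalRecord> records;

    // Log the update before touching any on-disk structure.
    size_t append(const std::string& payload) {
        records.push_back({payload, false});
        return records.size() - 1;
    }

    // Mark the update durable once all structures are updated.
    void commit(size_t idx) { records[idx].committed = true; }

    // On startup, re-apply every record that never reached commit,
    // bringing filter index, sparse storage, and HNSW back in sync.
    void recover(const std::function<void(const std::string&)>& replay) {
        for (auto& r : records) {
            if (!r.committed) {
                replay(r.payload);
                r.committed = true;
            }
        }
    }
};
```

The key property is that replaying an uncommitted record must be idempotent, since a crash may have applied some of its parts already.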
When more than 65,536 ids shared the same numeric filter value, Bucket::serialize produced incorrect output. The fix has four parts in src/filter/numeric_index.hpp:
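The 65,536 boundary (2^16) suggests, though the PR text does not state it, that the serialized id count fit in 16 bits. The sketch below is hypothetical (the `Bucket` layout and both helpers are invented for illustration) and shows how a 16-bit count silently wraps while a 32-bit count round-trips:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical bucket: the list of ids sharing one numeric filter value.
struct Bucket {
    std::vector<uint32_t> ids;
};

// Buggy layout: id count stored as uint16_t, so 65'537 ids wrap to 1.
std::vector<uint8_t> serialize_u16(const Bucket& b) {
    std::vector<uint8_t> out(sizeof(uint16_t) + b.ids.size() * sizeof(uint32_t));
    uint16_t n = static_cast<uint16_t>(b.ids.size());  // silent truncation
    std::memcpy(out.data(), &n, sizeof(n));
    std::memcpy(out.data() + sizeof(n), b.ids.data(),
                b.ids.size() * sizeof(uint32_t));
    return out;
}

// Fixed layout: a 32-bit count survives any realistic bucket size.
std::vector<uint8_t> serialize_u32(const Bucket& b) {
    std::vector<uint8_t> out(sizeof(uint32_t) + b.ids.size() * sizeof(uint32_t));
    uint32_t n = static_cast<uint32_t>(b.ids.size());
    std::memcpy(out.data(), &n, sizeof(n));
    std::memcpy(out.data() + sizeof(n), b.ids.data(),
                b.ids.size() * sizeof(uint32_t));
    return out;
}

// Read back the stored count from each layout.
uint32_t stored_count_u16(const std::vector<uint8_t>& buf) {
    uint16_t n;
    std::memcpy(&n, buf.data(), sizeof(n));
    return n;
}
uint32_t stored_count_u32(const std::vector<uint8_t>& buf) {
    uint32_t n;
    std::memcpy(&n, buf.data(), sizeof(n));
    return n;
}
```

With 65,537 ids, the 16-bit layout records a count of 1 and the remaining ids become unreachable garbage on deserialization, which matches the described failure mode.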
Editing any header transitively included by ndd.hpp (filter.hpp, …
Adjacent issue, not addressed here: the slide-split LEFT-bucket rebuild.
This is a breaking change: existing indexes must be rebuilt (reindexing required).
…y checks ids.empty() in numeric_index.cpp (line 306), removal deletes the bucket on that basis (line 412), and range skips ids.empty() buckets (line 1004). Also, split rebuilds the left bitmap only from ids (line 626), dropping bitmap-only duplicate IDs.
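The split hazard in the last sentence can be reproduced in miniature. The sketch below is hypothetical (the real bucket presumably pairs an id vector with a roaring bitmap, not a std::set); it only shows why rebuilding the left bucket's bitmap from ids alone drops ids present only in the bitmap:

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical bucket: an id vector plus a bitmap-like set that may
// hold additional ids not mirrored into the vector.
struct Bucket {
    std::vector<uint32_t> ids;
    std::set<uint32_t> bitmap;  // stand-in for a roaring bitmap
};

// Buggy split: the left bucket's bitmap is rebuilt from ids only,
// so any bitmap-only id below the pivot silently disappears.
Bucket split_left_buggy(const Bucket& b, uint32_t pivot) {
    Bucket left;
    for (uint32_t id : b.ids) {
        if (id < pivot) {
            left.ids.push_back(id);
            left.bitmap.insert(id);
        }
    }
    return left;
}

// Fixed split: also carry over bitmap entries below the pivot.
Bucket split_left_fixed(const Bucket& b, uint32_t pivot) {
    Bucket left = split_left_buggy(b, pivot);
    for (uint32_t id : b.bitmap) {
        if (id < pivot) left.bitmap.insert(id);
    }
    return left;
}
```

With ids = {1, 2} and bitmap = {1, 2, 3}, id 3 exists only in the bitmap; the buggy split loses it while the fixed split keeps it.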





