Add prefix filter support by zaidoon1 · Pull Request #186 · fjall-rs/lsm-tree

zaidoon1 · 2025-11-03T19:51:39Z

Rebased #151 on top of main and adapted it to the new API.

marvin-j97 · 2026-01-02T15:45:07Z

Unfortunately this has been hit by another wave of conflicts, but I just released 3.0.0, so there will be a bit of a freeze of activity from this point on.

zaidoon1 · 2026-01-03T06:57:34Z

No worries! Let me know when you want me to resume work on this, or when you are ready to merge things in again. Also feel free to ping me on any tickets, features, etc. Happy to help with whatever (lsm-tree or fjall itself).

marvin-j97 · 2026-02-09T23:06:38Z

At this point 3.0 has stabilized I think. I'm definitely keen on getting prefix extractors and compaction filters in as the next major features.

codecov · 2026-02-10T20:33:14Z

Codecov Report

❌ Patch coverage is 95.46926% with 28 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/run_reader.rs	93.02%	12 Missing ⚠️
src/table/mod.rs	92.62%	9 Missing ⚠️
src/table/writer/filter/partitioned.rs	73.68%	5 Missing ⚠️
src/blob_tree/mod.rs	96.96%	1 Missing ⚠️
src/range.rs	95.00%	1 Missing ⚠️

marvin-j97 · 2026-02-11T13:43:09Z

I will take a more in-depth look at this PR soon, but in the meantime: the run_reader logic is mostly not covered by tests, so I think there are still edge cases missing from the tests. Other files are not as affected or even improve in coverage, so that's good.

zaidoon1 · 2026-02-11T16:55:02Z

sounds good, i'll add more tests to cover the run_reader logic

zaidoon1 · 2026-02-11T18:19:20Z

Note there are some false positives, like https://app.codecov.io/gh/fjall-rs/lsm-tree/pull/186#644ae531cb268487817af88f68673c70-R56, where the doc comments are showing up as "untested".

marvin-j97 · 2026-02-12T18:57:05Z

> note there are some false positives like: https://app.codecov.io/gh/fjall-rs/lsm-tree/pull/186#644ae531cb268487817af88f68673c70-R56 where the doc comments are showing up as "untested"

That makes sense because the extractors are never actually asserted to work correctly.

Adding something like

assert_eq!(..., SegmentedPrefixExtractor.name());

assert!(..., SegmentedPrefixExtractor.extract(...));

should fix it.
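
For illustration, assertions of that kind could look like the sketch below. The `PrefixExtractor` trait shape, the `"segmented"` name, and the "prefix ends at the first `:`" rule are all assumptions made for this example; the real trait and the test-local extractor in this PR may look different.

```rust
/// Hypothetical stand-in for the crate's extractor trait (assumed shape,
/// not copied from lsm-tree).
pub trait PrefixExtractor {
    /// Name that would be persisted alongside the table metadata.
    fn name(&self) -> &str;

    /// Returns the prefix of `key`, or `None` if the key is out of domain.
    fn extract<'a>(&self, key: &'a [u8]) -> Option<&'a [u8]>;
}

/// Hypothetical extractor: the prefix is everything up to and including
/// the first `:` separator. Keys without a separator are out of domain.
pub struct SegmentedPrefixExtractor;

impl PrefixExtractor for SegmentedPrefixExtractor {
    fn name(&self) -> &str {
        "segmented"
    }

    fn extract<'a>(&self, key: &'a [u8]) -> Option<&'a [u8]> {
        let pos = key.iter().position(|&b| b == b':')?;
        Some(&key[..=pos])
    }
}

#[test]
fn segmented_extractor_is_asserted() {
    // Asserting on both methods exercises the code paths that the
    // coverage report flagged.
    assert_eq!("segmented", SegmentedPrefixExtractor.name());
    assert_eq!(
        Some(&b"user:"[..]),
        SegmentedPrefixExtractor.extract(b"user:42")
    );
    assert!(SegmentedPrefixExtractor.extract(b"no-separator").is_none());
}
```

Covering the out-of-domain case as well documents what the extractor does with keys that have no prefix.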

zaidoon1 · 2026-02-12T22:11:46Z

Sounds good, I'll add that, but I'll wait for your other feedback on this PR and address it all in one go.

marvin-j97 · 2026-02-12T22:40:18Z

tests/prefix_filter.rs

This file is probably at the point where it could be split into multiple smaller files. But I can do that later.

Add prefix-aware filter support to the LSM-tree. When a prefix extractor is configured, extracted prefixes are added to Bloom filters and the extractor name is stored in table metadata. Point reads use a prefix pre-check (maybe_contains_prefix) before falling back to the full-key filter. Range scans skip tables whose prefix filter definitively excludes the query range, both upfront and lazily during iteration.

Includes a fix for PartitionedFilterWriter::finish, which panicked when no filter partitions were created (e.g. all keys shorter than the required prefix length). The empty tli_handles guard returns early instead of attempting to encode an empty top-level index.
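
The point-read flow described above can be sketched roughly as follows. This is a simplified model, not the code in this PR: an exact `HashSet` stands in for the Bloom filter (so it never reports false positives), and `TableFilter`, `fixed_prefix`, `insert_key`, and `may_contain` are hypothetical names. Only `maybe_contains_prefix` is taken from the description.

```rust
use std::collections::HashSet;

/// Stand-in for a per-table Bloom filter. A real Bloom filter is
/// probabilistic; an exact set keeps the sketch deterministic.
#[derive(Default)]
pub struct TableFilter {
    entries: HashSet<Vec<u8>>,
}

/// Hypothetical fixed-length extractor: keys shorter than `len` are out of
/// domain and yield no prefix.
pub fn fixed_prefix(key: &[u8], len: usize) -> Option<&[u8]> {
    key.get(..len)
}

impl TableFilter {
    /// On write, register the full key and, if the key is in domain,
    /// its extracted prefix.
    pub fn insert_key(&mut self, key: &[u8], prefix_len: usize) {
        if let Some(prefix) = fixed_prefix(key, prefix_len) {
            self.entries.insert(prefix.to_vec());
        }
        self.entries.insert(key.to_vec());
    }

    pub fn maybe_contains_prefix(&self, prefix: &[u8]) -> bool {
        self.entries.contains(prefix)
    }

    pub fn maybe_contains_key(&self, key: &[u8]) -> bool {
        self.entries.contains(key)
    }

    /// Point read: prefix pre-check first, then the full-key check.
    /// Out-of-domain keys skip the pre-check entirely.
    pub fn may_contain(&self, key: &[u8], prefix_len: usize) -> bool {
        if let Some(prefix) = fixed_prefix(key, prefix_len) {
            if !self.maybe_contains_prefix(prefix) {
                return false; // the table cannot hold this key
            }
        }
        self.maybe_contains_key(key)
    }
}
```

A range scan over a single prefix could consult `maybe_contains_prefix` in the same way to decide whether a table needs to be opened at all. Out-of-domain keys must never be rejected by the pre-check, which is the case the fuzzer's in-domain / out-of-domain key mix is meant to exercise.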

Oracle-based differential fuzzer: runs the same AFL-derived operation sequence against two trees (one with prefix extractor, one without) and asserts all reads return identical results. Any mismatch = wrongly applied filter = silent data loss, saved by AFL as a crash for replay.

Covers all identified correctness dimensions:

- 9 extractor variants × 3 bpk levels × 3 filter partitioning policies
- MVCC snapshot reads at older seqnos while writes continue
- Weak tombstones and their compaction GC interaction
- Extractor changes on reopen (prefix_filter_allowed compatibility)
- Partitioned filter forced on all levels (the path that had the panic)
- Bidirectional iterator stepping (PrefixPingPong)
- Unbounded iteration (FirstKV/LastKV)
- Clustered keys (first byte 0..7, len 1..9) for realistic prefix distribution with natural in-domain / out-of-domain key mix
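
The differential idea can be illustrated with a toy model. The sketch below does not use AFL or the lsm-tree API: `Plain` and `Filtered` are hypothetical in-memory stores, the operation set is reduced to insert/remove/get, and the prefix length is an arbitrary constant. It only shows the oracle comparison itself.

```rust
use std::collections::{BTreeMap, HashSet};

/// Reduced operation set; the fuzzer described above covers far more.
#[derive(Clone, Debug)]
pub enum Op {
    Insert(Vec<u8>, Vec<u8>),
    Remove(Vec<u8>),
    Get(Vec<u8>),
}

/// Arbitrary prefix length chosen for this sketch.
const PREFIX_LEN: usize = 2;

/// Oracle: no filter at all.
#[derive(Default)]
pub struct Plain {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
}

/// Subject: reads are short-circuited by a prefix set.
#[derive(Default)]
pub struct Filtered {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
    prefixes: HashSet<Vec<u8>>,
}

impl Plain {
    pub fn apply(&mut self, op: &Op) -> Option<Vec<u8>> {
        match op {
            Op::Insert(k, v) => {
                self.data.insert(k.clone(), v.clone());
                None
            }
            Op::Remove(k) => {
                self.data.remove(k);
                None
            }
            Op::Get(k) => self.data.get(k).cloned(),
        }
    }
}

impl Filtered {
    pub fn apply(&mut self, op: &Op) -> Option<Vec<u8>> {
        match op {
            Op::Insert(k, v) => {
                // Only in-domain keys contribute a prefix.
                if let Some(p) = k.get(..PREFIX_LEN) {
                    self.prefixes.insert(p.to_vec());
                }
                self.data.insert(k.clone(), v.clone());
                None
            }
            Op::Remove(k) => {
                self.data.remove(k);
                None
            }
            Op::Get(k) => {
                if let Some(p) = k.get(..PREFIX_LEN) {
                    if !self.prefixes.contains(p) {
                        return None; // filter says "definitely absent"
                    }
                }
                self.data.get(k).cloned()
            }
        }
    }
}

/// Replays `ops` against both stores and returns the index of the first
/// operation whose results differ, if any.
pub fn first_mismatch(ops: &[Op]) -> Option<usize> {
    let mut oracle = Plain::default();
    let mut subject = Filtered::default();
    ops.iter()
        .position(|op| oracle.apply(op) != subject.apply(op))
}
```

In a fuzz harness, a `Some(index)` result would be turned into a panic so the fuzzer records the input as a crash for replay.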

marvin-j97 added the enhancement, epic, api, type:filters, and type:table labels Nov 7, 2025

zaidoon1 force-pushed the zaidoon/prefix-filter branch from fc17be2 to 2df2ae4 February 10, 2026 20:30

zaidoon1 force-pushed the zaidoon/prefix-filter branch from 2df2ae4 to 58ad234 February 10, 2026 20:33

zaidoon1 force-pushed the zaidoon/prefix-filter branch 3 times, most recently from af5ac86 to c1ea699 February 11, 2026 17:46

zaidoon1 force-pushed the zaidoon/prefix-filter branch from c1ea699 to eaec760 February 11, 2026 18:33

marvin-j97 reviewed Feb 12, 2026

marvin-j97 added the performance, type:point-read, and file format labels Feb 12, 2026

zaidoon1 force-pushed the zaidoon/prefix-filter branch 2 times, most recently from 1567681 to ea18d59 February 13, 2026 11:29

zaidoon1 added 2 commits February 14, 2026 09:57

zaidoon1 force-pushed the zaidoon/prefix-filter branch from 7f9fffc to ee30eba February 14, 2026 14:58
