IN LIST: add UInt16 bitmap filter#23012

Open
geoffreyclaude wants to merge 1 commit into apache:main from geoffreyclaude:perf/in_list_bitmap_u16_filter

@geoffreyclaude geoffreyclaude commented Jun 18, 2026

Contributor

Which issue does this PR close?

Rationale for this change

#23011 uses a bitmap checklist for UInt8, where there are 256 possible values. UInt16 is the same idea with a larger value range: 0 through 65,535.

That is still small enough to represent directly. A UInt16 bitmap needs one bit for each possible value:

  • 65,536 possible values
  • 65,536 bits total
  • 8 KB of memory

Then a lookup is still simple: use the input value as the bit position and check whether that bit is set. For example, if the list contains 42, bit 42 is set, and every input row with value 42 can be recognized with one bit test.
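The lookup described above can be sketched as follows. This is a minimal illustration, not the PR's code: the `U16Bitmap` name, its methods, and the `u64` word layout are assumptions chosen for the example.

```rust
/// Hypothetical sketch of a 65,536-bit membership bitmap for `u16` values.
/// The type and method names are illustrative and are not taken from the PR.
pub struct U16Bitmap {
    // 1024 words * 64 bits = 65,536 bits = 8 KB, kept on the heap.
    words: Box<[u64; 1024]>,
}

impl U16Bitmap {
    /// Build the bitmap once from the constant list.
    pub fn from_values(values: &[u16]) -> Self {
        let mut words = Box::new([0u64; 1024]);
        for &v in values {
            let idx = v as usize;
            // High 10 bits select the word, low 6 bits select the bit.
            words[idx >> 6] |= 1u64 << (idx & 63);
        }
        Self { words }
    }

    /// Membership is a single indexed bit test.
    #[inline]
    pub fn contains(&self, v: u16) -> bool {
        let idx = v as usize;
        (self.words[idx >> 6] >> (idx & 63)) & 1 == 1
    }

    /// Size of the bitmap payload in bytes.
    pub fn size_in_bytes(&self) -> usize {
        std::mem::size_of_val(&*self.words)
    }
}
```

With a list containing 42, `U16Bitmap::from_values(&[42]).contains(42)` is true and every other value tests false; the payload is 8,192 bytes regardless of how many values the list holds.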

This PR keeps the scope narrow: it adds the unsigned 2-byte bitmap path as a concrete UInt16 filter. #23035 then unifies the UInt8 and UInt16 implementations, and #23013 uses that shared shape for signed same-width reinterpretation.

What changes are included in this PR?

  • Adds UInt16BitmapFilter, backed by a heap-allocated 65,536-bit bitmap.
  • Routes UInt16 constant-list filtering to that bitmap path.
  • Keeps the same IN / NOT IN null behavior as the generic path.
  • Adds focused coverage for UInt16 boundary values, nulls, and NOT IN.
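For readers unfamiliar with the null behavior mentioned above, here is a sketch of standard SQL three-valued logic for `IN` and `NOT IN`. It assumes the generic path follows those standard semantics; the `in_list` function and its parameters are hypothetical and only illustrate the rules, with `None` standing in for SQL NULL.

```rust
/// Three-valued `IN` / `NOT IN` result; `None` models SQL NULL.
/// `contains` stands in for any membership test over the non-null constants
/// (for example a bitmap lookup). All names here are illustrative.
pub fn in_list(
    value: Option<u16>,
    contains: impl Fn(u16) -> bool,
    list_has_null: bool,
    negated: bool,
) -> Option<bool> {
    // A NULL input yields NULL for both IN and NOT IN.
    let v = value?;
    if contains(v) {
        // Definite match: IN is true, NOT IN is false.
        Some(!negated)
    } else if list_has_null {
        // No match, but a NULL in the list leaves the answer unknown.
        None
    } else {
        // Definite non-match: IN is false, NOT IN is true.
        Some(negated)
    }
}
```

The membership test itself only ever sees non-null constants, so the null handling can sit outside whichever lookup structure is used.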

Are these changes tested?

Yes.

  • cargo fmt --all
  • cargo test -p datafusion-physical-expr bitmap_filter_u16 --lib
  • cargo test -p datafusion-physical-expr in_list_int_types --lib
  • cargo test -p datafusion-physical-expr test_in_list_from_array_type_combinations --lib
  • cargo test -p datafusion-physical-expr test_in_list_dictionary_types --lib
  • cargo clippy -p datafusion-physical-expr --all-targets --all-features -- -D warnings

Are there any user-facing changes?

No. This is an internal performance optimization only.

Benchmark note

No local in_list_strategy numbers are included for this PR because the benchmark harness does not currently include a direct UInt16 case. The available i16 rows measure the signed reinterpretation path added in #23013 after the bitmap unification in #23035, not this PR's unsigned UInt16 bitmap filter.

@github-actions github-actions Bot added the physical-expr Changes to the physical-expr crates label Jun 18, 2026
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_bitmap_u16_filter branch 2 times, most recently from 55f3836 to 81ec379 Compare June 18, 2026 08:40
@geoffreyclaude geoffreyclaude changed the title Extend Bitmap Filter to UInt16 (Heap-based) IN LIST: add UInt16 bitmap filter Jun 18, 2026
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_bitmap_u16_filter branch 2 times, most recently from 7043d4b to 2dbce01 Compare June 19, 2026 05:35
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_bitmap_u16_filter branch 4 times, most recently from 56f08ef to 080b1b4 Compare June 22, 2026 16:03
alamb added a commit to alamb/datafusion that referenced this pull request Jun 24, 2026
## Which issue does this PR close?

- Part of apache#19241.
- Stacked on apache#21927.
- Next in stack: apache#23012.
- Extracted from apache#19390.

## Rationale for this change

`IN LIST` evaluates expressions like `x IN (1, 3, 7)`. The list on the
right is fixed, so DataFusion can precompute a small lookup structure
once and then reuse it for every input row.

For `UInt8`, there are only 256 possible values: 0 through 255. That
means the lookup can be a tiny checklist with one bit per possible
value:

- If the list contains `3`, set bit `3`.
- If the list contains `7`, set bit `7`.
- To check whether an input value is present, read that one bit.

So instead of hashing each input value or comparing it against the list,
membership becomes one indexed bit test. The bitmap is only 32 bytes,
because 256 bits = 32 bytes.
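
As a rough illustration of the 32-byte checklist, the sketch below stores the
256 bits in four `u64` words and reuses the bitmap for every input value. The
`U8Bitmap` type and the `filter` helper are hypothetical names for this
example, not the filter added by this PR.

```rust
/// Hypothetical 256-bit bitmap for `u8` values: 4 words * 64 bits = 32 bytes.
#[derive(Clone, Copy, Default)]
pub struct U8Bitmap([u64; 4]);

impl U8Bitmap {
    /// Set one bit per constant in the list.
    pub fn from_values(values: &[u8]) -> Self {
        let mut bits = [0u64; 4];
        for &v in values {
            // High 2 bits select the word, low 6 bits select the bit.
            bits[(v >> 6) as usize] |= 1u64 << (v & 63);
        }
        Self(bits)
    }

    /// Read the one bit that corresponds to `v`.
    #[inline]
    pub fn contains(self, v: u8) -> bool {
        (self.0[(v >> 6) as usize] >> (v & 63)) & 1 == 1
    }
}

/// Apply the precomputed bitmap to every input value (nulls not modeled here).
pub fn filter(bitmap: U8Bitmap, input: &[u8]) -> Vec<bool> {
    input.iter().map(|&v| bitmap.contains(v)).collect()
}
```

For the list `(1, 3, 7)`, filtering the input `[0, 1, 3, 7, 255]` marks only
the three listed values as present.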

This PR adds the first specialized primitive path in the stack as a
concrete `UInt8` filter. The `UInt16` version is added in apache#23012, and
the shared bitmap abstraction is introduced only after both concrete
implementations are visible in apache#23035.

## What changes are included in this PR?

- Adds `UInt8BitmapFilter`, a 32-byte bitmap built from the non-null
constants in the `IN` list.
- Routes `UInt8` constant-list filtering to that bitmap path.
- Keeps the same SQL null behavior as the generic path for both `IN` and
`NOT IN`.
- Moves shared dictionary-needle handling into `static_filter.rs`, so
specialized filters can reuse it consistently.
- Adds focused tests for `UInt8` null handling and dictionary-encoded
needles.

## Are these changes tested?

Yes.

- `cargo fmt --all`
- `cargo test -p datafusion-physical-expr bitmap_filter_u8 --lib`
- `cargo test -p datafusion-physical-expr in_list_int_types --lib`
- `cargo clippy -p datafusion-physical-expr --all-targets --all-features -- -D warnings`

## Are there any user-facing changes?

No. This is an internal performance optimization only.

<!-- codex-benchmark-start -->
## Local benchmark snapshot

Benchmark command:

```bash
cargo bench -p datafusion-physical-expr --profile release-nonlto --bench in_list_strategy -- --save-baseline <name>
```

Method: compare adjacent saved baselines using raw Criterion sample
minima (`min(time / iters)`). Lower is better; changes within +/-5% are
treated as noise. These numbers were not rerun after splitting the
bitmap abstraction into apache#23035.

Compared baselines:
[apache#21927](apache#21927) ->
[apache#23011](apache#23011)

Relevant scope: UInt8 narrow-integer rows.

Summary: 5 relevant rows, 5 faster, 0 slower, 0 within +/-5%.

| Benchmark | Before | After | Change |
|---|---:|---:|---:|
| `narrow_integer/u8/list=16/match=0%` | 20.39 us | 3.94 us | -80.7% (5.18x faster) |
| `narrow_integer/u8/list=16/match=50%` | 38.38 us | 3.98 us | -89.6% (9.65x faster) |
| `narrow_integer/u8/list=4/match=0%` | 18.18 us | 3.93 us | -78.4% (4.62x faster) |
| `narrow_integer/u8/list=4/match=50%` | 34.63 us | 3.96 us | -88.6% (8.75x faster) |
| `nulls/narrow_integer/u8/list=16/match=50%/nulls=20%` | 37.12 us | 4.16 us | -88.8% (8.93x faster) |
<!-- codex-benchmark-end -->
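
The arithmetic behind the Change column can be sketched as below. This is an
assumed reconstruction of the stated method (per-sample `time / iters` minima,
a +/-5% noise band), not the script that produced the numbers; the function
names are hypothetical.

```rust
/// Minimum per-iteration time over Criterion-style samples of
/// (total time, iteration count). Returns `None` for an empty slice.
pub fn sample_min(samples: &[(f64, u64)]) -> Option<f64> {
    samples
        .iter()
        .map(|&(time, iters)| time / iters as f64)
        .reduce(f64::min)
}

/// Percentage change and speedup factor between two baselines.
/// Negative change means the "after" baseline is faster.
pub fn compare(before: f64, after: f64) -> (f64, f64) {
    let change_pct = (after / before - 1.0) * 100.0;
    let speedup = before / after;
    (change_pct, speedup)
}

/// Changes within +/-5% are treated as noise.
pub fn is_noise(change_pct: f64) -> bool {
    change_pct.abs() <= 5.0
}
```

For the first row, `compare(20.39, 3.94)` gives roughly -80.7% and a 5.18x
speedup, matching the table.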

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Implements an 8 KB heap-allocated bitmap for UInt16. Maintains O(1) performance while handling the larger value space. Triggers for UInt16 arrays.
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_bitmap_u16_filter branch from 080b1b4 to 85133a6 Compare June 24, 2026 20:49
@geoffreyclaude geoffreyclaude marked this pull request as ready for review June 24, 2026 20:54
@alamb alamb added the performance Make DataFusion faster label Jun 24, 2026
@alamb

alamb commented Jun 24, 2026

Contributor

run benchmark in_list_strategy

@alamb alamb left a comment

Contributor

Looks good to me -- thank you @geoffreyclaude


/// Bitmap filter for O(1) `UInt16` set membership via single bit test.
///
/// `UInt16` has 65,536 possible values, so the filter stores membership in an

Contributor

The only question I have is whether it is worth 8 KB for a small IN list -- e.g. if there are 3 elements, 8 KB may be a lot of memory overhead, though perhaps the performance is worth it

Contributor Author

Actually the branchless strategy (see later PR in the stack) wins against bitmaps on lists of sizes up to 8 (both for u8 and u16.)
I still need to amend it though, as it currently skips 1 and 2 byte types.

Contributor

sounds like we have a plan -- let's keep going then!

Contributor Author

One PR at a time! This is much cleaner and more trackable than everything in a single mega PR, for sure

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4793704788-667-djb9b 6.12.68+ #1 SMP Sat May 2 07:49:07 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing perf/in_list_bitmap_u16_filter (85133a6) to e2c3e18 (merge-base) diff using: in_list_strategy
Results will be posted here when complete



@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                                                HEAD                                   perf_in_list_bitmap_u16_filter
-----                                                                ----                                   ------------------------------
dictionary/i32/dict=10/list=16                                       1.00      7.6±0.01µs        ? ?/sec    1.00      7.6±0.01µs        ? ?/sec
dictionary/i32/dict=100/list=16                                      1.00      7.7±0.01µs        ? ?/sec    1.00      7.7±0.01µs        ? ?/sec
dictionary/i32/dict=100/list=16/NOT_IN                               1.00      7.8±0.01µs        ? ?/sec    1.00      7.7±0.01µs        ? ?/sec
dictionary/i32/dict=100/list=4                                       1.00      7.7±0.03µs        ? ?/sec    1.00      7.7±0.02µs        ? ?/sec
dictionary/i32/dict=100/list=64                                      1.00      7.8±0.01µs        ? ?/sec    1.00      7.8±0.01µs        ? ?/sec
dictionary/i32/dict=1000/list=16                                     1.01      9.1±0.11µs        ? ?/sec    1.00      9.0±0.01µs        ? ?/sec
dictionary/utf8_long/dict=100/list=16                                1.00      8.3±0.02µs        ? ?/sec    1.00      8.3±0.01µs        ? ?/sec
dictionary/utf8_short/dict=50/list=32                                1.00      8.2±0.02µs        ? ?/sec    1.00      8.1±0.01µs        ? ?/sec
dictionary/utf8_short/dict=50/list=8                                 1.00      8.0±0.01µs        ? ?/sec    1.00      8.0±0.01µs        ? ?/sec
dictionary/utf8_short/dict=500/list=20                               1.00      9.6±0.04µs        ? ?/sec    1.00      9.6±0.01µs        ? ?/sec
f32/large_list/list=64/match=0%                                      1.05     16.2±0.01µs        ? ?/sec    1.00     15.4±0.02µs        ? ?/sec
f32/large_list/list=64/match=50%                                     1.00     23.8±0.22µs        ? ?/sec    1.13     26.8±0.41µs        ? ?/sec
f32/small_list/list=32/match=0%                                      1.05     16.2±0.05µs        ? ?/sec    1.00     15.4±0.01µs        ? ?/sec
f32/small_list/list=32/match=50%                                     1.04     29.0±0.44µs        ? ?/sec    1.00     28.0±0.25µs        ? ?/sec
f32/small_list/list=4/match=0%                                       1.03     16.0±0.03µs        ? ?/sec    1.00     15.5±0.01µs        ? ?/sec
f32/small_list/list=4/match=50%                                      1.02     27.5±0.16µs        ? ?/sec    1.00     27.0±0.43µs        ? ?/sec
fixed_size_binary/fsb16/list=10000/match=0%                          1.00     25.3±0.04µs        ? ?/sec    1.03     26.2±0.48µs        ? ?/sec
fixed_size_binary/fsb16/list=10000/match=50%                         1.00     54.5±0.29µs        ? ?/sec    1.01     54.9±0.18µs        ? ?/sec
fixed_size_binary/fsb16/list=256/match=0%                            1.02     24.4±0.57µs        ? ?/sec    1.00     23.9±0.09µs        ? ?/sec
fixed_size_binary/fsb16/list=256/match=50%                           1.00     49.8±0.37µs        ? ?/sec    1.00     50.1±0.40µs        ? ?/sec
fixed_size_binary/fsb16/list=4/match=0%                              1.00     22.8±0.05µs        ? ?/sec    1.01     23.1±0.03µs        ? ?/sec
fixed_size_binary/fsb16/list=4/match=50%                             1.00     55.1±0.35µs        ? ?/sec    1.00     55.2±0.41µs        ? ?/sec
fixed_size_binary/fsb16/list=64/match=0%                             1.06     24.3±0.33µs        ? ?/sec    1.00     23.0±0.03µs        ? ?/sec
fixed_size_binary/fsb16/list=64/match=50%                            1.00     55.5±0.19µs        ? ?/sec    1.00     55.4±0.28µs        ? ?/sec
narrow_integer/i16/list=256/match=0%                                 1.00     12.2±0.02µs        ? ?/sec    1.02     12.4±0.02µs        ? ?/sec
narrow_integer/i16/list=256/match=50%                                1.00     17.5±0.18µs        ? ?/sec    1.14     19.9±0.34µs        ? ?/sec
narrow_integer/i16/list=4/match=0%                                   1.00     12.6±0.02µs        ? ?/sec    1.00     12.6±0.02µs        ? ?/sec
narrow_integer/i16/list=4/match=50%                                  1.00     22.4±0.17µs        ? ?/sec    1.04     23.4±0.27µs        ? ?/sec
narrow_integer/i16/list=64/match=0%                                  1.00     12.2±0.05µs        ? ?/sec    1.01     12.4±0.08µs        ? ?/sec
narrow_integer/i16/list=64/match=50%                                 1.00     18.7±0.22µs        ? ?/sec    1.29     24.2±0.15µs        ? ?/sec
narrow_integer/u8/list=16/match=0%                                   1.00      5.2±0.00µs        ? ?/sec    1.00      5.2±0.00µs        ? ?/sec
narrow_integer/u8/list=16/match=50%                                  1.00      5.2±0.00µs        ? ?/sec    1.00      5.2±0.00µs        ? ?/sec
narrow_integer/u8/list=4/match=0%                                    1.00      5.2±0.00µs        ? ?/sec    1.00      5.2±0.00µs        ? ?/sec
narrow_integer/u8/list=4/match=50%                                   1.00      5.2±0.00µs        ? ?/sec    1.00      5.2±0.00µs        ? ?/sec
nulls/narrow_integer/u8/list=16/match=50%/nulls=20%                  1.00      5.3±0.04µs        ? ?/sec    1.00      5.3±0.01µs        ? ?/sec
nulls/primitive/i32/large_list/list=64/match=50%/nulls=20%           1.24     26.3±0.18µs        ? ?/sec    1.00     21.2±0.19µs        ? ?/sec
nulls/primitive/i32/small_list/list=16/match=50%/nulls=20%           1.00     27.7±0.36µs        ? ?/sec    1.09     30.2±0.48µs        ? ?/sec
nulls/primitive/i32/small_list/list=16/match=50%/nulls=20%/NOT_IN    1.03     26.0±0.14µs        ? ?/sec    1.00     25.4±0.15µs        ? ?/sec
nulls/primitive/i32/small_list/list=16/match=50%/nulls=50%           1.00     21.8±0.12µs        ? ?/sec    1.03     22.5±0.05µs        ? ?/sec
nulls/utf8/long_24b/list=16/match=50%/nulls=20%                      1.00     69.5±0.35µs        ? ?/sec    1.00     69.8±0.44µs        ? ?/sec
nulls/utf8/short_8b/list=16/match=50%/nulls=20%                      1.00     58.5±0.36µs        ? ?/sec    1.04     60.9±0.76µs        ? ?/sec
nulls/utf8view/long_24b/list=16/match=50%/nulls=20%                  1.00     77.4±0.30µs        ? ?/sec    1.13     87.4±0.20µs        ? ?/sec
nulls/utf8view/short_8b/list=16/match=50%/nulls=20%                  1.00     40.5±0.12µs        ? ?/sec    1.07     43.1±0.11µs        ? ?/sec
nulls/utf8view/short_8b/list=16/match=50%/nulls=20%/NOT_IN           1.00     40.7±0.19µs        ? ?/sec    1.06     43.0±0.13µs        ? ?/sec
nulls/utf8view/short_8b/list=16/match=50%/nulls=50%                  1.00     29.2±0.12µs        ? ?/sec    1.06     30.8±0.06µs        ? ?/sec
primitive/i32/large_list/list=256/match=0%                           1.00     12.0±0.12µs        ? ?/sec    1.06     12.8±0.06µs        ? ?/sec
primitive/i32/large_list/list=256/match=50%                          1.00     26.8±0.24µs        ? ?/sec    1.02     27.4±0.23µs        ? ?/sec
primitive/i32/large_list/list=64/match=0%                            1.00     11.9±0.01µs        ? ?/sec    1.00     11.8±0.01µs        ? ?/sec
primitive/i32/large_list/list=64/match=50%                           1.11     33.3±0.16µs        ? ?/sec    1.00     30.1±0.27µs        ? ?/sec
primitive/i32/small_list/list=16/match=50%/NOT_IN                    1.00     27.8±0.30µs        ? ?/sec    1.03     28.7±0.35µs        ? ?/sec
primitive/i32/small_list/list=32/match=0%                            1.00     12.0±0.02µs        ? ?/sec    1.00     12.0±0.01µs        ? ?/sec
primitive/i32/small_list/list=32/match=50%                           1.00     32.4±0.19µs        ? ?/sec    1.02     33.1±0.20µs        ? ?/sec
primitive/i32/small_list/list=4/match=0%                             1.01     12.1±0.05µs        ? ?/sec    1.00     12.0±0.01µs        ? ?/sec
primitive/i32/small_list/list=4/match=50%                            1.04     34.5±0.18µs        ? ?/sec    1.00     33.1±0.24µs        ? ?/sec
primitive/i64/large_list/list=128/match=0%                           1.03     12.7±0.04µs        ? ?/sec    1.00     12.4±0.01µs        ? ?/sec
primitive/i64/large_list/list=128/match=50%                          1.00     18.1±0.19µs        ? ?/sec    1.01     18.3±0.11µs        ? ?/sec
primitive/i64/large_list/list=32/match=0%                            1.01     12.9±0.04µs        ? ?/sec    1.00     12.7±0.01µs        ? ?/sec
primitive/i64/large_list/list=32/match=50%                           1.00     22.0±0.41µs        ? ?/sec    1.02     22.4±0.38µs        ? ?/sec
primitive/i64/small_list/list=16/match=0%                            1.00     12.1±0.03µs        ? ?/sec    1.01     12.3±0.02µs        ? ?/sec
primitive/i64/small_list/list=16/match=50%                           1.29     22.6±0.33µs        ? ?/sec    1.00     17.5±0.11µs        ? ?/sec
primitive/i64/small_list/list=4/match=0%                             1.00     11.9±0.02µs        ? ?/sec    1.02     12.1±0.08µs        ? ?/sec
primitive/i64/small_list/list=4/match=50%                            1.00     23.7±0.20µs        ? ?/sec    1.01     23.8±0.27µs        ? ?/sec
timestamp_ns/large_list/list=32/match=0%                             1.00     17.2±0.02µs        ? ?/sec    1.01     17.3±0.03µs        ? ?/sec
timestamp_ns/large_list/list=32/match=50%                            1.00     29.0±0.30µs        ? ?/sec    1.04     30.2±0.24µs        ? ?/sec
timestamp_ns/small_list/list=16/match=0%                             1.00     17.3±0.02µs        ? ?/sec    1.00     17.2±0.03µs        ? ?/sec
timestamp_ns/small_list/list=16/match=50%                            1.00     29.3±0.60µs        ? ?/sec    1.00     29.2±0.13µs        ? ?/sec
timestamp_ns/small_list/list=4/match=0%                              1.00     17.0±0.07µs        ? ?/sec    1.00     17.1±0.03µs        ? ?/sec
timestamp_ns/small_list/list=4/match=50%                             1.00     30.0±0.16µs        ? ?/sec    1.04     31.3±0.42µs        ? ?/sec
utf8/long_24b/list=256/match=0%                                      1.00     33.9±0.03µs        ? ?/sec    1.01     34.3±0.05µs        ? ?/sec
utf8/long_24b/list=256/match=50%                                     1.02     71.9±0.35µs        ? ?/sec    1.00     70.2±0.47µs        ? ?/sec
utf8/long_24b/list=4/match=0%                                        1.00     34.4±0.35µs        ? ?/sec    1.00     34.6±0.06µs        ? ?/sec
utf8/long_24b/list=4/match=50%                                       1.00     72.6±0.36µs        ? ?/sec    1.01     73.4±0.44µs        ? ?/sec
utf8/long_24b/list=64/match=0%                                       1.00     33.7±0.04µs        ? ?/sec    1.01     34.0±0.04µs        ? ?/sec
utf8/long_24b/list=64/match=50%                                      1.01     71.6±0.36µs        ? ?/sec    1.00     70.6±0.69µs        ? ?/sec
utf8/mixed_len/list=16/match=0%                                      1.03     36.6±0.08µs        ? ?/sec    1.00     35.5±0.07µs        ? ?/sec
utf8/mixed_len/list=16/match=50%                                     1.09    105.8±0.52µs        ? ?/sec    1.00     96.7±0.59µs        ? ?/sec
utf8/mixed_len/list=64/match=0%                                      1.00     38.0±0.08µs        ? ?/sec    1.02     38.7±0.07µs        ? ?/sec
utf8/mixed_len/list=64/match=50%                                     1.11    115.6±0.42µs        ? ?/sec    1.00    103.7±0.80µs        ? ?/sec
utf8/shared_prefix/pfx=12/list=32/match=50%                          1.00     72.3±0.64µs        ? ?/sec    1.01     73.4±0.27µs        ? ?/sec
utf8/short_8b/list=16/match=50%/NOT_IN                               1.00     62.6±0.42µs        ? ?/sec    1.02     64.0±0.29µs        ? ?/sec
utf8/short_8b/list=256/match=0%                                      1.00     26.3±0.03µs        ? ?/sec    1.01     26.7±0.03µs        ? ?/sec
utf8/short_8b/list=256/match=50%                                     1.00     62.9±0.34µs        ? ?/sec    1.02     64.0±0.48µs        ? ?/sec
utf8/short_8b/list=4/match=0%                                        1.00     26.3±0.03µs        ? ?/sec    1.01     26.5±0.09µs        ? ?/sec
utf8/short_8b/list=4/match=50%                                       1.00     62.1±0.33µs        ? ?/sec    1.03     63.8±0.30µs        ? ?/sec
utf8/short_8b/list=64/match=0%                                       1.03     27.4±0.16µs        ? ?/sec    1.00     26.7±0.18µs        ? ?/sec
utf8/short_8b/list=64/match=50%                                      1.00     61.5±0.40µs        ? ?/sec    1.03     63.2±0.35µs        ? ?/sec
utf8view/len_12b/list=16/match=0%                                    1.00     17.8±0.04µs        ? ?/sec    1.01     18.0±0.08µs        ? ?/sec
utf8view/len_12b/list=16/match=50%                                   1.03     48.5±0.25µs        ? ?/sec    1.00     47.2±0.23µs        ? ?/sec
utf8view/len_12b/list=64/match=0%                                    1.00     17.9±0.03µs        ? ?/sec    1.02     18.1±0.03µs        ? ?/sec
utf8view/len_12b/list=64/match=50%                                   1.01     46.3±0.20µs        ? ?/sec    1.00     45.9±0.27µs        ? ?/sec
utf8view/long_24b/list=16/match=0%                                   1.01     40.6±0.26µs        ? ?/sec    1.00     40.3±0.06µs        ? ?/sec
utf8view/long_24b/list=16/match=50%                                  1.00     86.4±0.18µs        ? ?/sec    1.02     88.2±0.26µs        ? ?/sec
utf8view/long_24b/list=256/match=0%                                  1.00     40.1±0.04µs        ? ?/sec    1.03     41.2±0.26µs        ? ?/sec
utf8view/long_24b/list=256/match=50%                                 1.00     84.6±0.21µs        ? ?/sec    1.03     86.8±0.25µs        ? ?/sec
utf8view/long_24b/list=4/match=0%                                    1.00     40.3±0.08µs        ? ?/sec    1.00     40.3±0.05µs        ? ?/sec
utf8view/long_24b/list=4/match=50%                                   1.00     86.0±0.29µs        ? ?/sec    1.02     88.0±0.24µs        ? ?/sec
utf8view/long_24b/list=64/match=0%                                   1.00     40.3±0.05µs        ? ?/sec    1.00     40.5±0.06µs        ? ?/sec
utf8view/long_24b/list=64/match=50%                                  1.00     83.4±0.27µs        ? ?/sec    1.03     85.6±0.97µs        ? ?/sec
utf8view/mixed_len/list=16/match=0%                                  1.00     30.2±0.07µs        ? ?/sec    1.01     30.4±0.10µs        ? ?/sec
utf8view/mixed_len/list=16/match=50%                                 1.12     72.9±0.68µs        ? ?/sec    1.00     65.1±0.47µs        ? ?/sec
utf8view/mixed_len/list=64/match=0%                                  1.00     34.3±0.12µs        ? ?/sec    1.00     34.3±0.21µs        ? ?/sec
utf8view/mixed_len/list=64/match=50%                                 1.04     84.7±0.61µs        ? ?/sec    1.00     81.2±0.46µs        ? ?/sec
utf8view/shared_prefix/pfx=12/list=32/match=0%                       1.00     42.2±0.18µs        ? ?/sec    1.02     42.9±0.26µs        ? ?/sec
utf8view/shared_prefix/pfx=12/list=32/match=50%                      1.00     82.4±0.29µs        ? ?/sec    1.05     86.8±0.20µs        ? ?/sec
utf8view/shared_prefix/pfx=16/list=64/match=0%                       1.00     40.5±0.11µs        ? ?/sec    1.02     41.4±0.29µs        ? ?/sec
utf8view/shared_prefix/pfx=16/list=64/match=50%                      1.00     83.3±0.21µs        ? ?/sec    1.03     85.5±0.15µs        ? ?/sec
utf8view/shared_prefix/pfx=8/list=16/match=0%                        1.03     30.6±0.41µs        ? ?/sec    1.00     29.8±0.07µs        ? ?/sec
utf8view/shared_prefix/pfx=8/list=16/match=50%                       1.00     71.8±0.29µs        ? ?/sec    1.01     72.6±0.34µs        ? ?/sec
utf8view/short_8b/list=16/match=0%                                   1.00     17.7±0.02µs        ? ?/sec    1.01     17.7±0.05µs        ? ?/sec
utf8view/short_8b/list=16/match=50%                                  1.00     43.3±0.32µs        ? ?/sec    1.02     44.1±0.31µs        ? ?/sec
utf8view/short_8b/list=256/match=0%                                  1.00     17.9±0.03µs        ? ?/sec    1.00     17.8±0.07µs        ? ?/sec
utf8view/short_8b/list=256/match=50%                                 1.00     42.6±0.14µs        ? ?/sec    1.03     43.7±0.25µs        ? ?/sec
utf8view/short_8b/list=4/match=0%                                    1.00     17.8±0.02µs        ? ?/sec    1.01     18.1±0.04µs        ? ?/sec
utf8view/short_8b/list=4/match=50%                                   1.01     46.8±0.25µs        ? ?/sec    1.00     46.4±0.21µs        ? ?/sec
utf8view/short_8b/list=64/match=0%                                   1.00     18.0±0.04µs        ? ?/sec    1.00     18.0±0.06µs        ? ?/sec
utf8view/short_8b/list=64/match=50%                                  1.00     41.4±0.38µs        ? ?/sec    1.03     42.8±0.20µs        ? ?/sec

Resource Usage

in_list_strategy — base (merge-base)

| Metric | Value |
|---|---:|
| Wall time | 1355.3s |
| Peak memory | 41.4 MiB |
| Avg memory | 28.5 MiB |
| CPU user | 1414.3s |
| CPU sys | 1.3s |
| Peak spill | 0 B |

in_list_strategy — branch

| Metric | Value |
|---|---:|
| Wall time | 1350.3s |
| Peak memory | 42.2 MiB |
| Avg memory | 29.0 MiB |
| CPU user | 1416.7s |
| CPU sys | 1.3s |
| Peak spill | 0 B |


@alamb alamb added this pull request to the merge queue Jun 25, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 25, 2026
Labels

performance Make DataFusion faster physical-expr Changes to the physical-expr crates

3 participants