@adriangb
Contributor

This is a WIP PR to explore a change and run benchmarks; it is not intended to be in a reviewable state.
Most of this code was AI generated and needs careful human review before it is mergeable.

@github-actions (bot) added the labels physical-expr (Changes to the physical-expr crates), core (Core DataFusion crate), and physical-plan (Changes to the physical-plan crate) on Oct 27, 2025
@adriangb force-pushed the bloom-filter-pushdown branch from 67efa2b to 7b020d8 on October 27, 2025 18:27
@adriangb changed the title from "Bloom filter pushdown" to "use bloom filters to push down hash table lookups in HashJoinExec" on Oct 27, 2025
@adriangb
Contributor Author

adriangb commented Oct 27, 2025

Preliminary results:

❯ cargo run --release -p datafusion-cli -- -f q.sql
    Finished `release` profile [optimized] target(s) in 0.13s
     Running `target/release/datafusion-cli -f q.sql`
DataFusion CLI v50.3.0
+-------+
| count |
+-------+
| 1000  |
+-------+
1 row(s) fetched. 
Elapsed 0.007 seconds.

0 row(s) fetched. 
Elapsed 0.001 seconds.

+-----------+
| count     |
+-----------+
| 100000000 |
+-----------+
1 row(s) fetched. 
Elapsed 1.437 seconds.

0 row(s) fetched. 
Elapsed 0.001 seconds.

+----+----+----+
| k  | v  | k  |
+----+----+----+
| 50 | 50 | 50 |
| 51 | 51 | 51 |
| 52 | 52 | 52 |
| 53 | 53 | 53 |
| 54 | 54 | 54 |
| 55 | 55 | 55 |
| 56 | 56 | 56 |
| 57 | 57 | 57 |
| 58 | 58 | 58 |
| 59 | 59 | 59 |
| 60 | 60 | 60 |
| 61 | 61 | 61 |
| 62 | 62 | 62 |
| 63 | 63 | 63 |
| 64 | 64 | 64 |
| 65 | 65 | 65 |
| 66 | 66 | 66 |
| 67 | 67 | 67 |
| 68 | 68 | 68 |
| 69 | 69 | 69 |
| 70 | 70 | 70 |
| 71 | 71 | 71 |
| 72 | 72 | 72 |
| 73 | 73 | 73 |
| 74 | 74 | 74 |
| 75 | 75 | 75 |
| 76 | 76 | 76 |
| 77 | 77 | 77 |
| 78 | 78 | 78 |
| 79 | 79 | 79 |
| 80 | 80 | 80 |
| 81 | 81 | 81 |
| 82 | 82 | 82 |
| 83 | 83 | 83 |
| 84 | 84 | 84 |
| 85 | 85 | 85 |
| 86 | 86 | 86 |
| 87 | 87 | 87 |
| 88 | 88 | 88 |
| 89 | 89 | 89 |
| .            |
| .            |
| .            |
+----+----+----+
951 row(s) fetched. (First 40 displayed. Use --maxrows to adjust)
Elapsed 0.005 seconds.


datafusion-clone on bloom-filter-pushdown [?] is 📦 v50.3.0 via 🦀 v1.90.0 took 2s
❯ datafusion-cli -f q.sql                          
DataFusion CLI v50.0.0
+-------+
| count |
+-------+
| 1000  |
+-------+
1 row(s) fetched. 
Elapsed 0.003 seconds.

0 row(s) fetched. 
Elapsed 0.001 seconds.

+-----------+
| count     |
+-----------+
| 100000000 |
+-----------+
1 row(s) fetched. 
Elapsed 1.438 seconds.

0 row(s) fetched. 
Elapsed 0.001 seconds.

+----+----+----+
| k  | v  | k  |
+----+----+----+
| 50 | 50 | 50 |
| 51 | 51 | 51 |
| 52 | 52 | 52 |
| 53 | 53 | 53 |
| 54 | 54 | 54 |
| 55 | 55 | 55 |
| 56 | 56 | 56 |
| 57 | 57 | 57 |
| 58 | 58 | 58 |
| 59 | 59 | 59 |
| 60 | 60 | 60 |
| 61 | 61 | 61 |
| 62 | 62 | 62 |
| 63 | 63 | 63 |
| 64 | 64 | 64 |
| 65 | 65 | 65 |
| 66 | 66 | 66 |
| 67 | 67 | 67 |
| 68 | 68 | 68 |
| 69 | 69 | 69 |
| 70 | 70 | 70 |
| 71 | 71 | 71 |
| 72 | 72 | 72 |
| 73 | 73 | 73 |
| 74 | 74 | 74 |
| 75 | 75 | 75 |
| 76 | 76 | 76 |
| 77 | 77 | 77 |
| 78 | 78 | 78 |
| 79 | 79 | 79 |
| 80 | 80 | 80 |
| 81 | 81 | 81 |
| 82 | 82 | 82 |
| 83 | 83 | 83 |
| 84 | 84 | 84 |
| 85 | 85 | 85 |
| 86 | 86 | 86 |
| 87 | 87 | 87 |
| 88 | 88 | 88 |
| 89 | 89 | 89 |
| .            |
| .            |
| .            |
+----+----+----+
951 row(s) fetched. (First 40 displayed. Use --maxrows to adjust)
Elapsed 0.101 seconds.

❯ ./bench.sh compare main bloom-filter-pushdown                      
Comparing main and bloom-filter-pushdown
--------------------
Benchmark tpch_sf10.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       main ┃ bloom-filter-pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  516.69 ms │             471.33 ms │ +1.10x faster │
│ QQuery 2     │  101.85 ms │              94.26 ms │ +1.08x faster │
│ QQuery 3     │  266.65 ms │             245.87 ms │ +1.08x faster │
│ QQuery 4     │  220.78 ms │             196.18 ms │ +1.13x faster │
│ QQuery 5     │  393.63 ms │             353.80 ms │ +1.11x faster │
│ QQuery 6     │  145.39 ms │             131.00 ms │ +1.11x faster │
│ QQuery 7     │  542.04 ms │             498.71 ms │ +1.09x faster │
│ QQuery 8     │  437.90 ms │             385.31 ms │ +1.14x faster │
│ QQuery 9     │  647.38 ms │             577.67 ms │ +1.12x faster │
│ QQuery 10    │  355.89 ms │             322.23 ms │ +1.10x faster │
│ QQuery 11    │   78.78 ms │              72.90 ms │ +1.08x faster │
│ QQuery 12    │  217.50 ms │             199.46 ms │ +1.09x faster │
│ QQuery 13    │  379.35 ms │             335.68 ms │ +1.13x faster │
│ QQuery 14    │  194.42 ms │             170.24 ms │ +1.14x faster │
│ QQuery 15    │  274.45 ms │             245.11 ms │ +1.12x faster │
│ QQuery 16    │   67.99 ms │              65.19 ms │     no change │
│ QQuery 17    │  708.83 ms │             628.52 ms │ +1.13x faster │
│ QQuery 18    │ 1002.48 ms │             912.62 ms │ +1.10x faster │
│ QQuery 19    │  319.69 ms │             265.64 ms │ +1.20x faster │
│ QQuery 20    │  253.56 ms │             227.40 ms │ +1.12x faster │
│ QQuery 21    │  760.56 ms │             662.62 ms │ +1.15x faster │
│ QQuery 22    │   94.52 ms │              78.71 ms │ +1.20x faster │
└──────────────┴────────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (main)                    │ 7980.32ms │
│ Total Time (bloom-filter-pushdown)   │ 7140.43ms │
│ Average Time (main)                  │  362.74ms │
│ Average Time (bloom-filter-pushdown) │  324.56ms │
│ Queries Faster                       │        21 │
│ Queries Slower                       │         0 │
│ Queries with No Change               │         1 │
│ Queries with Failure                 │         0 │
└──────────────────────────────────────┴───────────┘

Here q.sql is essentially the example from https://datafusion.apache.org/blog/2025/09/10/dynamic-filters/#hash-join-dynamic-filters

@adriangb
Contributor Author

I do wonder whether using the same CASE (hash(...) % num_parts) WHEN 0 THEN ... structure as in #18306, or using a single bloom filter, could improve things here.

@adriangb
Contributor Author

I also tried a couple of different ndv/fpp sizes and found that at 10k/1% TPCH SF10 performance went down, I believe because of the overhead of building the bloom filter. But at 1k/5% there is no noticeable difference, and it still works well for the other query. It would make sense to try setting ndv based on statistics (either the estimated ndv of the table or something like `min(num_rows / 1000, 10_000)`).
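
To make the ndv/fpp trade-off concrete, here is a rough sketch of the textbook Bloom filter sizing formulas, m = -n * ln(p) / (ln 2)^2 bits and k = (m / n) * ln 2 hash probes. This is an assumption about sizing in general, not necessarily what the bloom filter used in this PR does, and the function names (`optimal_num_bits`, `optimal_num_hashes`, `ndv_from_row_count`) are hypothetical.

```rust
/// Textbook Bloom filter sizing: m = -n * ln(p) / (ln 2)^2 bits.
/// (Assumption: a classic Bloom filter, not a split-block variant.)
fn optimal_num_bits(ndv: u64, fpp: f64) -> u64 {
    assert!(ndv > 0 && fpp > 0.0 && fpp < 1.0);
    let ln2 = std::f64::consts::LN_2;
    (-(ndv as f64) * fpp.ln() / (ln2 * ln2)).ceil() as u64
}

/// Textbook number of hash probes: k = (m / n) * ln 2, at least 1.
fn optimal_num_hashes(num_bits: u64, ndv: u64) -> u32 {
    let k = (num_bits as f64 / ndv as f64) * std::f64::consts::LN_2;
    (k.round() as u32).max(1)
}

/// Hypothetical heuristic from the comment above:
/// min(num_rows / 1000, 10_000), clamped to at least 1.
fn ndv_from_row_count(num_rows: u64) -> u64 {
    (num_rows / 1000).min(10_000).max(1)
}
```

Under these formulas, 10k/1% comes out at roughly 96k bits (about 12 KB) with 7 probes per value, while 1k/5% needs roughly 6k bits (under 1 KB) with 4 probes, which is at least consistent with the smaller setting being cheaper to build.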

@adriangb
Contributor Author

I think I've figured out how to make the bloom filters very, very cheap to build: re-use the hashes calculated for the hash table so that the only things we ever insert into the bloom filter are u64s.
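
A minimal sketch of that idea, assuming the join already has one u64 hash per build-side row: the filter never hashes the key columns itself and derives all probe positions from the existing hash via double hashing. The type `HashBloomFilter` and its methods are hypothetical illustrations, not the code in this PR.

```rust
/// Minimal Bloom filter that only ever sees precomputed u64 hashes.
/// (Hypothetical sketch; names and layout are assumptions.)
struct HashBloomFilter {
    bits: Vec<u64>,
    num_bits: u64,
    num_hashes: u32,
}

impl HashBloomFilter {
    fn new(num_bits: u64, num_hashes: u32) -> Self {
        // Round up to whole 64-bit words, with at least one word.
        let words = ((num_bits.max(64) + 63) / 64) as usize;
        Self {
            bits: vec![0; words],
            num_bits: words as u64 * 64,
            num_hashes: num_hashes.max(1),
        }
    }

    /// i-th probe position via double hashing: h1 + i * h2 (mod m).
    /// Both h1 and h2 come from the single precomputed hash.
    fn probe(&self, hash: u64, i: u32) -> u64 {
        let h1 = hash;
        let h2 = hash.rotate_left(32) | 1; // force an odd step
        h1.wrapping_add((i as u64).wrapping_mul(h2)) % self.num_bits
    }

    /// Insert a hash that was already computed for the hash table.
    fn insert_hash(&mut self, hash: u64) {
        for i in 0..self.num_hashes {
            let bit = self.probe(hash, i);
            self.bits[(bit / 64) as usize] |= 1u64 << (bit % 64);
        }
    }

    /// False means "definitely absent"; true means "possibly present".
    fn might_contain_hash(&self, hash: u64) -> bool {
        (0..self.num_hashes).all(|i| {
            let bit = self.probe(hash, i);
            self.bits[(bit / 64) as usize] & (1u64 << (bit % 64)) != 0
        })
    }
}
```

This only works if the probe side hashes its join keys with the same hash function and seed as the build side, so that equal keys yield equal u64s.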

@adriangb
Contributor Author

adriangb commented Oct 27, 2025

> I think I've figured out how to make the bloom filters very, very cheap to build: re-use the hashes calculated for the hash table so that the only things we ever insert into the bloom filter are u64s.

I've implemented this. I think it could be improved further by using the CASE ... structure to avoid checking every partition's zone map / bloom filter.
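
To illustrate what the CASE ... structure would buy, here is a rough sketch comparing "OR over every partition's filter" with "route by hash % num_parts and check one filter". It assumes the build side was hash-partitioned with the same hash used for routing. `PartitionFilter`, `toy_hash`, and the other names are hypothetical, and a `HashSet<u64>` stands in for the bloom filter to keep the example short.

```rust
use std::collections::HashSet;

/// Hypothetical per-partition filter: min/max bounds (zone map) plus a
/// set of hashes standing in for the bloom filter.
struct PartitionFilter {
    min: i64,
    max: i64,
    hashes: HashSet<u64>,
}

impl PartitionFilter {
    fn might_match(&self, key: i64, hash: u64) -> bool {
        key >= self.min && key <= self.max && self.hashes.contains(&hash)
    }
}

/// Toy stand-in for the join's hash function (an assumption).
fn toy_hash(key: i64) -> u64 {
    (key as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

/// Build one filter per partition, partitioning keys by hash % num_parts.
fn build_filters(keys: &[i64], num_parts: usize) -> Vec<PartitionFilter> {
    let mut filters: Vec<PartitionFilter> = (0..num_parts)
        .map(|_| PartitionFilter { min: i64::MAX, max: i64::MIN, hashes: HashSet::new() })
        .collect();
    for &key in keys {
        let hash = toy_hash(key);
        let f = &mut filters[(hash % num_parts as u64) as usize];
        f.min = f.min.min(key);
        f.max = f.max.max(key);
        f.hashes.insert(hash);
    }
    filters
}

/// Checks every partition's filter: O(num_parts) work per row.
fn check_all(filters: &[PartitionFilter], key: i64, hash: u64) -> bool {
    filters.iter().any(|f| f.might_match(key, hash))
}

/// CASE-style routing: only the partition the hash maps to is checked.
fn check_routed(filters: &[PartitionFilter], key: i64, hash: u64) -> bool {
    let part = (hash % (filters.len() as u64)) as usize;
    filters[part].might_match(key, hash)
}
```

With matching partitioning, both functions return the same answer for every row, but the routed version evaluates one filter per row instead of all of them.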

❯ ./bench.sh compare main bloom-filter-pushdown                      
Comparing main and bloom-filter-pushdown
--------------------
Benchmark tpch_sf10.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       main ┃ bloom-filter-pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  516.69 ms │             481.66 ms │ +1.07x faster │
│ QQuery 2     │  101.85 ms │              93.63 ms │ +1.09x faster │
│ QQuery 3     │  266.65 ms │             250.29 ms │ +1.07x faster │
│ QQuery 4     │  220.78 ms │             206.86 ms │ +1.07x faster │
│ QQuery 5     │  393.63 ms │             371.83 ms │ +1.06x faster │
│ QQuery 6     │  145.39 ms │             132.72 ms │ +1.10x faster │
│ QQuery 7     │  542.04 ms │             513.40 ms │ +1.06x faster │
│ QQuery 8     │  437.90 ms │             384.47 ms │ +1.14x faster │
│ QQuery 9     │  647.38 ms │             590.60 ms │ +1.10x faster │
│ QQuery 10    │  355.89 ms │             330.20 ms │ +1.08x faster │
│ QQuery 11    │   78.78 ms │              75.66 ms │     no change │
│ QQuery 12    │  217.50 ms │             193.74 ms │ +1.12x faster │
│ QQuery 13    │  379.35 ms │             332.54 ms │ +1.14x faster │
│ QQuery 14    │  194.42 ms │             172.00 ms │ +1.13x faster │
│ QQuery 15    │  274.45 ms │             241.40 ms │ +1.14x faster │
│ QQuery 16    │   67.99 ms │              61.69 ms │ +1.10x faster │
│ QQuery 17    │  708.83 ms │             625.67 ms │ +1.13x faster │
│ QQuery 18    │ 1002.48 ms │             924.03 ms │ +1.08x faster │
│ QQuery 19    │  319.69 ms │             286.88 ms │ +1.11x faster │
│ QQuery 20    │  253.56 ms │             258.64 ms │     no change │
│ QQuery 21    │  760.56 ms │             690.64 ms │ +1.10x faster │
│ QQuery 22    │   94.52 ms │              82.26 ms │ +1.15x faster │
└──────────────┴────────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (main)                    │ 7980.32ms │
│ Total Time (bloom-filter-pushdown)   │ 7300.80ms │
│ Average Time (main)                  │  362.74ms │
│ Average Time (bloom-filter-pushdown) │  331.85ms │
│ Queries Faster                       │        20 │
│ Queries Slower                       │         0 │
│ Queries with No Change               │         2 │
│ Queries with Failure                 │         0 │
└──────────────────────────────────────┴───────────┘
