Predicate Pushdown for scans #2
@rustyconover Hey, thanks for all the input! All of your submitted issues are very valid points and are definitely on our to-do list for getting this extension beyond its initial proof-of-concept state!
Thanks Sam, I look forward to contributing.
How we do this in Iceberg (both Java and Python) is using visitors. When a query is executed against a snapshot, that snapshot has a reference to a ManifestList, which references one or more Manifests. The manifests contain information that can be used to prune data files.
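To make the visitor idea concrete, here is a minimal, self-contained sketch of evaluating a pushed-down predicate against the column bounds stored in a manifest. This is not the actual Java or PyIceberg code; all of the type and method names are made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class DataFile:
    """One manifest entry (illustrative subset of the real fields)."""
    path: str
    lower_bounds: dict  # column name -> min value in the file
    upper_bounds: dict  # column name -> max value in the file

@dataclass
class GreaterThan:
    column: str
    value: object
    def accept(self, visitor):
        return visitor.visit_greater_than(self)

@dataclass
class EqualTo:
    column: str
    value: object
    def accept(self, visitor):
        return visitor.visit_equal_to(self)

class ManifestEvaluator:
    """Visitor answering: 'might this data file contain matching rows?'"""
    def __init__(self, data_file):
        self.f = data_file

    def visit_greater_than(self, pred):
        hi = self.f.upper_bounds.get(pred.column)
        # Without stats we must assume the file may match.
        return hi is None or hi > pred.value

    def visit_equal_to(self, pred):
        lo = self.f.lower_bounds.get(pred.column)
        hi = self.f.upper_bounds.get(pred.column)
        if lo is None or hi is None:
            return True
        return lo <= pred.value <= hi

def prune(files, predicate):
    """Keep only files whose bounds make the predicate possibly true."""
    return [f for f in files if predicate.accept(ManifestEvaluator(f))]

files = [
    DataFile("s3://bucket/a.parquet", {"x": 0}, {"x": 10}),
    DataFile("s3://bucket/b.parquet", {"x": 50}, {"x": 99}),
]
print([f.path for f in prune(files, GreaterThan("x", 40))])
# -> ['s3://bucket/b.parquet']
```

In the real implementations the visitor walks an arbitrary expression tree (AND/OR/NOT and more predicate types), but the principle is the same: a file survives only if its manifest statistics cannot rule it out.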
Hey guys! Just wanted to check in and see if there might be updates on this, i.e., is this still prioritized? I wonder if it might be possible to have an intermediate solution in the meantime, where, similar to
@Fokko can you please update your links to the existing Java and Python implementations, since the references moved when the Iceberg repos were split out?
@rustyconover Certainly, updated!
Hi @samansmink and @Fokko, I spent a few hours looking at this extension, DuckDB itself, and the Iceberg spec to come up with a plan for implementing this. Right now I want to query Iceberg tables that consist of 100,000+ Parquet files that are highly partitioned, but without DuckDB supporting partitions and data-file metrics it will be impossible. So I'm trying to figure out how to make this happen. Here is what I've noticed:

TableFunction's bind_replace prevents access to query filters

I don't see a way to get access to the DuckDB query filters, since they are passed in the "init" phase of a TableFunction rather than the bind stage. Once the function has been changed to a plain Parquet scan by bind_replace, the Iceberg layer never gets a chance to use those filters for pruning. @samansmink, what do you think is a reasonable approach to enable this? A thought I had would be populating the metrics from the Iceberg manifests into the existing statistics infrastructure of DuckDB.

Manifest metrics and partition information

It appears the manifest metrics and partition information aren't being parsed from the Avro files. This seems to just require some work on the generated code for parsing the Avro files. This can happen in parallel to the other improvements.
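For context on what that Avro parsing involves, here is a rough sketch of pulling the pruning-relevant fields out of a manifest file using the fastavro library. The field names come from the Iceberg spec's manifest schema, but the exact in-memory representation may differ (e.g., the bounds maps can decode as lists of key/value records rather than dicts):

```python
# pip install fastavro
from fastavro import reader

def read_manifest_metrics(manifest_path):
    """Yield per-file pruning metadata from an Iceberg manifest (Avro)."""
    with open(manifest_path, "rb") as fh:
        for entry in reader(fh):
            df = entry["data_file"]
            yield {
                "path": df["file_path"],
                "partition": df["partition"],   # partition tuple for this file
                "records": df["record_count"],
                # lower/upper bounds map column field-id -> serialized min/max;
                # depending on the Avro decoding these may be key/value record lists
                "lower_bounds": df.get("lower_bounds"),
                "upper_bounds": df.get("upper_bounds"),
                # byte offsets where each row group starts inside the file
                "split_offsets": df.get("split_offsets"),
            }
```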
Seems like some progress on the necessary DuckDB infrastructure has started here: duckdb/duckdb#11806
@rustyconover: AFAIK the filter pushdown with Iceberg does get triggered. Note that pushdown happens in two places: once when the (rewritten) plan is bound, and again during query optimization.

And note that immediately after bind_replace, the generated replacement piece of the query plan is bound, and at that point Hive-partitioned file pruning based on pushdown happens. However, when Iceberg support was initially released, the second mechanism (in the query optimizer) did suffer from the absence of a pushdown rewrite rule for ANTI joins; note that bind_replace generates an anti-join between the main data files and the files with deleted row-ids. But in late 2023 these rules were implemented, so this should now work.
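To see why such a rewrite rule is valid, here is a tiny self-contained sketch (the data and names are invented) showing that filtering on a data column commutes with the anti-join against deleted row positions:

```python
# Rows surviving the anti-join are exactly those whose (file, pos) key does
# not appear on the delete side; a predicate on a data column does not look
# at that key, so it can be evaluated before or after the join.
rows = [
    {"file": "a.parquet", "pos": 0, "x": 5},
    {"file": "a.parquet", "pos": 1, "x": 50},
    {"file": "b.parquet", "pos": 0, "x": 70},
]
deletes = {("a.parquet", 1)}  # positional deletes from Iceberg delete files

def anti_join(rs):
    return [r for r in rs if (r["file"], r["pos"]) not in deletes]

def pred(r):
    return r["x"] > 40

filter_then_join = anti_join([r for r in rows if pred(r)])
join_then_filter = [r for r in anti_join(rows) if pred(r)]
assert filter_then_join == join_then_filter  # pushdown preserves the result
print(filter_then_join)  # [{'file': 'b.parquet', 'pos': 0, 'x': 70}]
```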
Hi @peterboncz, thanks for replying. The SQL query filter pushdown does get passed to the underlying Parquet scan, but by then the set of files is already fixed, and per-file pruning still depends on reading each file's footer. When an Iceberg table has millions of files, reading all these footers takes a long time. Iceberg addressed this issue by adding column metrics to the manifest files and a detailed "Scan Planning" section to the spec: https://iceberg.apache.org/spec/#scan-planning

I'm looking forward to DuckDB using Iceberg's statistics to plan the scan of the table; right now the current code includes every file in the manifest. By reading the manifest files and using their info to plan the scan, scanning an Iceberg table with a million files can be reduced to scanning just one or two files, especially when the query's predicates hit a small number of partitions.
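For comparison, this is roughly what manifest-driven scan planning looks like through PyIceberg's public API (the catalog and table names here are placeholders):

```python
from pyiceberg.catalog import load_catalog
from pyiceberg.expressions import And, EqualTo, GreaterThanOrEqual

catalog = load_catalog("default")        # catalog config comes from pyiceberg's own config
table = catalog.load_table("db.events")  # placeholder identifier

# plan_files() evaluates the filter against partition values and the
# lower/upper bounds stored in the manifests, so non-matching data files
# are skipped without ever opening their Parquet footers.
scan = table.scan(row_filter=And(EqualTo("region", "eu"),
                                 GreaterThanOrEqual("ts", "2024-01-01")))
for task in scan.plan_files():
    print(task.file.file_path)
```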
Thanks for chiming in here @peterboncz. I believe @rustyconover is right here. #8 focuses on the partitioning (called hidden partitioning in Iceberg, which is slightly different from Hive-style partitioning). Next to that, Iceberg keeps additional statistics on a per-file basis: for example, upper and lower bounds per column that let you skip files without having to read the footer of the Parquet file. But I don't see the
Hey @rustyconover, it looks like the blocker you mentioned, duckdb/duckdb#11806, got merged. Do you think the current state is ready for implementation?
@rustyconover are you planning to pick this up? Thanks
Not in the short term.
@febinsathar I have a version partially working here
#72: will this finally get merged, @mike-luabase? Are there any blockers left?
It currently appears that predicate pushdown isn't applied before determining which Parquet files are included when scanning the table's data.

The Avro manifest files include statistics about each Parquet file; those should be leveraged to reduce the number of files that need to be read. Additionally, the manifests carry the byte offset of each row group in each Parquet file, which can remove the need to read the footer of the Parquet file and allow better concurrent scanning.
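As an illustration of that second point, here is a minimal sketch (the ScanTask type and function are invented for illustration) that turns a file's split_offsets, the row-group start offsets the spec records per data file, into byte-range tasks that workers could read concurrently without first fetching the footer:

```python
from dataclasses import dataclass

@dataclass
class ScanTask:
    path: str
    start: int   # byte offset where this split begins
    length: int  # bytes to read

def tasks_from_split_offsets(path, file_size, split_offsets):
    """Turn Iceberg's per-file split_offsets into byte-range scan tasks.

    Each offset marks the start of a row group, so each task covers the
    bytes from one offset up to the next (or to end of file; the last
    task also spans the footer bytes, which a real reader would trim).
    """
    ends = list(split_offsets[1:]) + [file_size]
    return [ScanTask(path, start, end - start)
            for start, end in zip(split_offsets, ends)]

print(tasks_from_split_offsets("s3://bucket/a.parquet", 4096, [4, 1500, 3000]))
# [ScanTask(start=4, length=1496), ScanTask(start=1500, length=1500),
#  ScanTask(start=3000, length=1096)]
```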