Creating Delete Vectors using Java API or Spark #11968
Comments
Not sure where you got the [...] Q1/Q2: DVs will automatically be produced when the table's [...] DV work in general is being tracked by #11122. Let me know if that helps or whether you have any other questions.
I was trying out the Nightly Snapshot for PyIceberg (apache/iceberg-python#1516), and noticed that we don't produce any deletion vectors (yet):
I would expect a Puffin file here: It is a V3 table:
@Fokko is this Spark 3.5? I'll have a backport up for 3.4 shortly, so if it's 3.4 or earlier you're probably not going to see those. Edit: I see that it's Spark 3.5 ... that's strange.
OK, I did some local testing, and I think the issue is really in the output file path. We are outputting DVs, but the file suffix is the same as that of the configured V2 delete file format. For instance, the file is called "foo.parquet", but when you inspect it, it is really a Puffin file with the expected DVs. Figuring out why we're outputting this suffix...
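For anyone who wants to check a suspicious delete file themselves, here is a minimal sketch of one way to tell whether a file is Puffin regardless of its suffix. It relies only on the Puffin spec's 4-byte magic `PFA1` at the start of the file; the class and method names are made up for illustration and are not part of the Iceberg API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not an Iceberg class: sniffs the leading magic bytes of a file.
class PuffinMagicCheck {
    // The Puffin spec defines the magic as the bytes 0x50 0x46 0x41 0x31 ("PFA1").
    private static final byte[] PUFFIN_MAGIC = {0x50, 0x46, 0x41, 0x31};

    // True if the given header starts with the Puffin magic.
    static boolean hasPuffinMagic(byte[] header) {
        return header.length >= PUFFIN_MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, PUFFIN_MAGIC.length), PUFFIN_MAGIC);
    }

    // Reads the first four bytes of a local file and checks them.
    static boolean looksLikePuffin(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return hasPuffinMagic(in.readNBytes(PUFFIN_MAGIC.length));
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more local file paths, e.g. a delete file named "foo.parquet".
        for (String arg : args) {
            boolean puffin = looksLikePuffin(Path.of(arg));
            System.out.println(arg + " -> " + (puffin ? "Puffin" : "not Puffin"));
        }
    }
}
```

This only inspects the header, so it does not validate the footer or the blobs. It is meant as a quick sanity check for local files, not a replacement for reading the file with a Puffin reader.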
@amogh-jahagirdar Thanks Amogh, I checked and it looks good now:
Query engine
Spark
Question
Q1 - Is it possible to create Deletion Vectors in Apache Iceberg? Is a Deletion Vector file generated only for a specific delete type (positional vs. equality)?
Q2 - Looking for pointers on creating Deletion Vectors in Apache Iceberg using Spark or the Java API. So far I have tried creating a table using Spark, inserting a few rows, and deleting a few, with "write.delete.vector.enabled" set to true for the table, in the hope that this will generate Deletion Vector(s). Am I missing any steps here?
Here's the code snippet - https://github.com/piyushdubey/dataformats/blob/main/src/main/java/net/piyushdubey/data/IcebergTableOperations.java
Appreciate any pointers on this!
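Going by the replies in this thread (the table where DVs showed up is a V3 table written from Spark 3.5), the steps could be sketched roughly as below. This is a sketch under assumptions, not a confirmed recipe: it assumes an Iceberg build with V3 write support, that `'format-version'='3'` together with `'write.delete.mode'='merge-on-read'` is what makes Spark write DVs, and that the inserted rows land in a single data file (otherwise the delete may simply drop a whole file). The table name `local.db.dv_demo` and the class name are hypothetical. The block only builds the SQL strings so it runs without Spark on the classpath.

```java
import java.util.List;

// Hypothetical sketch: the SQL one might run through SparkSession.sql(...) to try producing a DV.
class DeletionVectorSqlSketch {
    static List<String> statements(String table) {
        return List.of(
            // Assumption: a V3 table with merge-on-read deletes.
            "CREATE TABLE " + table + " (id BIGINT, data STRING) USING iceberg "
                + "TBLPROPERTIES ('format-version'='3', 'write.delete.mode'='merge-on-read')",
            "INSERT INTO " + table + " VALUES (1, 'a'), (2, 'b'), (3, 'c')",
            // A row-level delete that leaves other rows in the same data file.
            "DELETE FROM " + table + " WHERE id = 2",
            // The delete_files metadata table lists the delete files and their formats.
            "SELECT file_path, file_format FROM " + table + ".delete_files");
    }

    public static void main(String[] args) {
        // With a configured SparkSession named `spark`, each statement would be
        // passed to spark.sql(statement); here they are only printed.
        statements("local.db.dv_demo").forEach(System.out::println);
    }
}
```

If DVs are being written, the last query would be expected to list a Puffin delete file. As noted in the comments above, the file suffix alone may be misleading in some builds.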