Reading Avro files supports other types #7828
Comments
I think there have been some improvements in the Avro support recently (in DataFusion 33, I think): https://github.com/apache/arrow-datafusion/pulls?q=is%3Apr+avro+is%3Aclosed For example, #7663 looks pretty similar to this request. There was also some talk about adding Avro support upstream in arrow: apache/arrow-rs#4886. Perhaps @sarutak has some wisdom to add here.
Thank you for letting me know!
@Asura7969 @alamb
Notice that the third record is not
BTW, Apache Spark has a similar feature that creates a table from Avro records, but it doesn't currently support a nullable top level.
Hi @sarutak -- @tustvold mentioned something similar today as part of his work on apache/arrow-rs#4886. I think Arrow also supports top level nulls in 4.Allow top level null records and represent them as a |
This is in some ways a special case of the more general issue of supporting Avro schemas that don't contain a Record as the root. I recently did something similar for arrow_json in apache/arrow-rs#4911, which is likely the approach I am going to take in apache/arrow-rs#4886.
Is your feature request related to a problem or challenge?
I am integrating incubator-paimon (a streaming data lake platform). When reading an Avro file, an exception is raised with the message "expected avro schema to be a record", because AvroArrowArrayReader only supports AvroSchema::Record, but the Avro file format of paimon is a Union type (code here).
Describe the solution you'd like
It would be better if users could supply their own implementation of the parsing step. The default implementation would remain the current behavior, so nothing changes for existing users.
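To make the request concrete, here is a minimal sketch of what a user-pluggable root-schema step might look like. Everything in it is hypothetical: ToySchema is a simplified stand-in and not the apache_avro schema type, and RootResolver, RecordOnly, and NullableRecord are illustrative names, not existing DataFusion APIs. It also assumes the union in question is a nullable record (null plus exactly one record), which the issue does not state explicitly.

```rust
// Hypothetical, simplified stand-in for an Avro schema (NOT the apache_avro types).
#[derive(Debug, Clone, PartialEq)]
enum ToySchema {
    Null,
    Long,
    Record { name: String, fields: Vec<(String, ToySchema)> },
    Union(Vec<ToySchema>),
}

/// Hypothetical extension point: picks the record that describes the rows of a file.
trait RootResolver {
    /// Default mirrors the behavior described in the issue: only a record root is accepted.
    fn resolve_root<'a>(&self, schema: &'a ToySchema) -> Result<&'a ToySchema, String> {
        match schema {
            ToySchema::Record { .. } => Ok(schema),
            _ => Err("expected avro schema to be a record".to_string()),
        }
    }
}

/// Uses the default (record-only) behavior unchanged.
struct RecordOnly;
impl RootResolver for RecordOnly {}

/// User-supplied override: also accepts a union of null and exactly one record.
struct NullableRecord;
impl RootResolver for NullableRecord {
    fn resolve_root<'a>(&self, schema: &'a ToySchema) -> Result<&'a ToySchema, String> {
        match schema {
            ToySchema::Record { .. } => Ok(schema),
            ToySchema::Union(variants) => {
                // Every variant must be either null or a record.
                let only_null_or_record = variants
                    .iter()
                    .all(|v| matches!(v, ToySchema::Null | ToySchema::Record { .. }));
                let mut records = variants
                    .iter()
                    .filter(|v| matches!(v, ToySchema::Record { .. }));
                // Accept only when there is exactly one record variant.
                match (records.next(), records.next(), only_null_or_record) {
                    (Some(record), None, true) => Ok(record),
                    _ => Err("expected a union of null and exactly one record".to_string()),
                }
            }
            _ => Err("expected a record or a nullable record".to_string()),
        }
    }
}

/// Builds an assumed union-rooted schema: ["null", record "row" { id: long }].
fn nullable_row_schema() -> ToySchema {
    ToySchema::Union(vec![
        ToySchema::Null,
        ToySchema::Record {
            name: "row".to_string(),
            fields: vec![("id".to_string(), ToySchema::Long)],
        },
    ])
}
```

With this shape, RecordOnly.resolve_root(&nullable_row_schema()) returns the "expected avro schema to be a record" error, while NullableRecord resolves the same schema to its inner record. Callers that never supply a resolver would keep the default behavior.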
Describe alternatives you've considered
My current solution: here and this
Additional context
No response