This repository has been archived by the owner on Jul 25, 2022. It is now read-only.
I feel odd even asking this, but is it possible to make enhancements so that datafusion-python can be used without pyarrow? pyarrow is fantastic and I already use it, but it is fairly large, which makes it somewhat painful to deploy for some serverless use cases (such as on AWS Lambda). If I am able to do everything I need in datafusion, is there a need for pyarrow? I confess I'm not very familiar with the interface between Rust / datafusion and Python / arrow, so hopefully this isn't too stupid of a question.
thx!
I think it might be possible; a good portion of the module doesn't require PyArrow. The only things that do are UDFs, UDAFs, and the parts of the DataFrame API that return PyArrow data structures (like collect() and schema()). Does a datafusion-python without those features sound appealing?
Cool, that was what it looked like to me as well from my scan of the code. IMHO, in the medium term it would be nice to have pyarrow as an optional feature. I think that datafusion should have some improvements on the IO front before enabling this, though (I'm looking into / working on writing capabilities in apache/datafusion#1777). Right now I think pyarrow has more functionality there, which is useful.
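For illustration, here is a minimal sketch of how a Python package can treat a dependency such as pyarrow as optional, using a guarded import. This is a generic pattern, not how datafusion-python is actually implemented; the helper names `optional_import`, `require`, and `collect_as_arrow` are hypothetical.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def require(name, feature):
    """Return the module, or raise a clear error naming the feature that needs it."""
    module = optional_import(name)
    if module is None:
        raise ImportError(
            f"{feature} requires the optional dependency '{name}'"
        )
    return module


def collect_as_arrow(columns):
    """Hypothetical feature that only works when pyarrow is installed.

    `columns` is a dict mapping column names to lists of values.
    """
    pa = require("pyarrow", "collect_as_arrow()")
    return pa.table(columns)
```

With this shape, importing the package never fails when pyarrow is absent; only the Arrow-returning features raise an error, and only when they are called.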