How to Deterministically Release the Memory of a pyarrow.Table #45078
Labels
Component: Documentation
Component: FlightRPC
Component: Python
Type: usage
Describe the usage question you have. Please include as many useful details as possible.
Hello, I have a use case in Python involving Arrow Flight that is exemplified by the snippet below:
The above snippet is a slight simplification. The real-world scenario is a little more complex because the table is obtained in a library I don’t necessarily have easy control over and is passed to user-level code.
At point (A) above, benchmarking in high-volume scenarios has shown it would be really good to free up the memory of the `arrow_table`. The table itself does not have an explicit `.close()` method or anything indicating we’re able to free the memory associated with it. A few things I have tried are:

- `del` and hoping at some point GC would kick in.
- `del` and explicitly calling the GC (just for testing, I am aware this is not a recommended practice).

In the last 2 cases above, just as a debugging exercise, I ended up printing the number of references to the `arrow_table` object before calling `del`. The expectation was that it’d be 1, but it was more than that, so my assumption is that something gets held internally within the Flight framework.

That said, my question is: is there a deterministic way that always works to release the memory of a `pyarrow.Table`? I can imagine why in most cases doing this would be quite cumbersome and it’d be best to rely on the reference counting mechanism plus the GC naturally kicking in, but in this particular case it would be quite useful.

I would also be grateful if I could get some pointers on the lifetime implications of these objects in Python. It is not very clear from the documentation, for example, whether the `arrow_table`’s lifetime from above is tied to the lifetime of the `reader` and vice versa. Again, I appreciate that in 99% of cases we shouldn’t need to care about it, but there’s still the 1% where having this explained a little more in depth would be of great use!

P.S. There is a near-identical example I had to do in Java, and the `VectorSchemaRoot`’s API conveniently exposes a `.close()` method, which works quite nicely in my use case.

Component(s)
Documentation, FlightRPC, Python
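For reference, the reference-counting behaviour described in the question can be reproduced with the standard library alone. This is a minimal sketch, not a statement about pyarrow or Flight internals: `FakeTable` is a hypothetical stand-in for a `pyarrow.Table`, and `framework_cache` is a hypothetical stand-in for whatever else might hold a reference. It only illustrates that, in CPython, `del` removes a name rather than the object, and that `gc.collect()` cannot free an object that is still reachable.

```python
import gc
import sys
import weakref


class FakeTable:
    """Hypothetical stand-in for a pyarrow.Table; it owns one large buffer."""

    def __init__(self, nbytes):
        self.buffer = bytearray(nbytes)


released = []  # the finalizer appends here when the object is actually freed

arrow_table = FakeTable(1_000_000)
weakref.finalize(arrow_table, released.append, "freed")

# sys.getrefcount also counts temporary references made by the call itself,
# so compare counts relative to each other instead of expecting exactly 1.
baseline = sys.getrefcount(arrow_table)

# Assumption for illustration: some framework object keeps its own reference.
framework_cache = [arrow_table]
extra_refs = sys.getrefcount(arrow_table) - baseline  # 1

del arrow_table                    # removes only this name
freed_after_del = bool(released)   # False: the cache still holds the object

gc.collect()                       # collects unreachable cycles only
freed_after_gc = bool(released)    # False: the object is still reachable

framework_cache.clear()            # drops the last remaining reference
freed_after_clear = bool(released) # True: CPython frees it immediately

print(extra_refs, freed_after_del, freed_after_gc, freed_after_clear)
```

With a real table, `pyarrow.total_allocated_bytes()` reports the bytes currently allocated from the default memory pool, so sampling it before and after dropping references may be a more direct check than counting references.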