Extracting file handles from DataArray #10320
Unanswered
charles-turner-1 asked this question in General
I've been working on some functionality that lets users inspect either:
in order to validate the user-supplied chunking passed to xarray when opening a dataset. The idea is that by inspecting the netCDF disk chunks, we can easily check whether the user-provided chunks are integer multiples of the disk chunks (if not, we expect performance issues), and adjust them to match the disk chunks if they are not.
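To make the check concrete, here is a minimal sketch in plain Python of the kind of validation described above. The names misaligned_dims and align_chunks, and the representation of chunks as dicts mapping dimension names to positive integers, are assumptions made for illustration; this is not the tool's actual API and not part of xarray.

```python
def misaligned_dims(user_chunks, disk_chunks):
    """Return the dimensions whose user chunk is not an integer
    multiple of the on-disk chunk.

    Both arguments are assumed to be dicts of dimension name ->
    positive int chunk length (a hypothetical representation).
    """
    return [
        dim
        for dim, size in user_chunks.items()
        if dim in disk_chunks and size % disk_chunks[dim] != 0
    ]


def align_chunks(user_chunks, disk_chunks):
    """Snap each user chunk to the nearest multiple of its disk chunk.

    Dimensions with no known disk chunk are left unchanged. The result
    is never smaller than one disk chunk.
    """
    aligned = {}
    for dim, size in user_chunks.items():
        disk = disk_chunks.get(dim)
        if disk is None:
            aligned[dim] = size
            continue
        # Round to the nearest whole number of disk chunks (at least 1).
        n_disk_chunks = max(1, (size + disk // 2) // disk)
        aligned[dim] = n_disk_chunks * disk
    return aligned


user = {"time": 100, "lat": 60}
disk = {"time": 30, "lat": 30}
bad = misaligned_dims(user, disk)
suggested = align_chunks(user, disk)
```

In this example only the time dimension is misaligned, and the suggested chunking keeps lat as it is while snapping time to a multiple of 30. Rounding to the nearest multiple is just one possible policy; rounding up or down consistently would be equally valid depending on memory constraints.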
Relevant functionality is here and a PR improving the functionality here.
However, I'm stumped on whether it's possible to actually extract all the file handles from an xr.DataArray, if opened with xr.open_mfdataset:
A dataset opened with xr.open_mfdataset will have ._close attributes from which the file handles can be extracted with the above logic. However, creating a DataArray from that dataset will set the .encoding['source'] attribute to the first file handle in the list of paths passed in to open the dataset, and I seem to lose any access to the full set of file handles.
I assume since Datasets & DataArrays are lazily loaded that there must still exist some sort of file handle somewhere which could be accessed somehow, even if it is a bit hacky.
Incidentally, AFAIK xarray doesn't really provide any mechanism to confirm that user-provided chunks line up nicely with disk chunks, which is the gap this tool I've been working on aims to address (I know a warning is emitted if the chunking splits disk chunks, but no info is given on how to fix it). If this can be done cleanly, I'm happy to open a PR adding the functionality if the community thinks it would be useful.