Replies: 1 comment
-
Yes - there will be a very long tail of issues caused by various weirdness in archival data. It's worth reiterating that it is not possible to guarantee in general that any given archival dataset can be virtualizarr-ed (see zarr-developers/VirtualiZarr#218 for some of the reasons). Any data can be copied as native chunks into icechunk though, so we always have that fallback, even though it would duplicate the bytes.
What does it even mean to concatenate a variable in this situation? It's ambiguous what you want the result to be. I think it's good that VirtualiZarr makes it more obvious that what you're asking for doesn't make sense. I think there may be ways to work around this though.
These are both limitations of zarr's data model. They are big limitations, which is why the DevSeed people and I have applied for grant funding to generalize it. Ryan's suggested approach is in zarr-developers/zarr-python#2536, though I haven't had time to read it in detail yet. For the MUR SST dataset specifically, @abarciauskas-bgse and I worked out a workaround that involved saving some references as virtual and some as native chunks, because once they are in icechunk it doesn't matter whether a chunk is virtual or native.
Not my problem 😁
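To picture the mixed virtual/native workaround mentioned above, here is a toy sketch in plain Python. It is not icechunk's or VirtualiZarr's API: `VirtualRef`, `NativeChunk`, `read_chunk`, and the in-memory `files` dict are all hypothetical names made up for illustration, and the "files" are just byte strings. The only point is that a reader resolves both kinds of entry to bytes in the same way, so a dataset can mix them freely.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class VirtualRef:
    """Hypothetical pointer to bytes that stay in the original archival file."""
    path: str
    offset: int
    length: int


@dataclass(frozen=True)
class NativeChunk:
    """Hypothetical chunk whose bytes were copied into the store itself."""
    data: bytes


ChunkEntry = Union[VirtualRef, NativeChunk]


def read_chunk(
    manifest: Dict[Tuple[int, ...], ChunkEntry],
    files: Dict[str, bytes],
    key: Tuple[int, ...],
) -> bytes:
    """Return the bytes for one chunk, whichever way it is stored."""
    entry = manifest[key]
    if isinstance(entry, NativeChunk):
        return entry.data
    blob = files[entry.path]  # stand-in for a byte-range read from object storage
    return blob[entry.offset : entry.offset + entry.length]


# In-memory stand-ins for two archival files (assumed layout: header, then payload).
files = {"day1.nc": b"HEADERaaaa", "day2.nc": b"HDRbbbb"}

manifest = {
    (0,): VirtualRef("day1.nc", offset=6, length=4),
    # Assumption for illustration: this chunk could not be referenced virtually,
    # so its bytes were rewritten as a native chunk instead.
    (1,): NativeChunk(b"cccc"),
    (2,): VirtualRef("day2.nc", offset=3, length=4),
}

data = b"".join(read_chunk(manifest, files, key) for key in sorted(manifest))
```

The cost of the native entries is the byte duplication noted earlier in this reply; the virtual entries duplicate nothing.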
-
VirtualiZarr is amazing! But the data... not so much (not just NASA data, but archival data in general). There are many things we'll have to work around to bring this functionality to life for most use cases.
Some of these issues are beyond our control; however, documenting where and how this works, and why it may fail, would be great for our users!
I've been testing high resolution MUR, concept-id `MUR-JPL-L4-GLOB-v4.1`, and when we took the analysis from a single month to N years, I ran into the issues described above. This work is very promising: I ran an analysis for 3 years (homogeneous years) and could get a `std()` for a region near the Gulf of Mexico in under 6 minutes for almost a TB of data (with spatial subsetting, less than that). I mean, great work @ayushnag @TomNicholas 🚀