Description
What is the support status of the Zarr V3 Specification in zarr-python 2.x? We hope to use zarr-python 2.x to open Zarr objects written in the Zarr V3 Specification. (Because of a dependency restriction, we are not able to upgrade zarr-python to 3.x at the moment. See dandi/dandi-cli#1609 for details.)
According to the documentation of zarr-python 2.18.5, zarr-python at a version >2.12,<3 includes an experimental implementation of the Zarr V3 Specification. However, through some experimentation, I have found that this experimental implementation is quite different from the final Zarr V3 Specification implemented in zarr-python 3.x.
For example, the following experiment shows that the latest zarr-python 2.x release (2.18.7 at the moment) is not able to open a simple Zarr object written in the final Zarr V3 Specification by zarr-python 3.x (3.0.6 at the moment).
- Given the following files.
create.py
# To be used for creating a Zarr object with Zarr V3 Specification using the
# zarr Python package at version 3 or later
import zarr
import numpy as np
z1 = zarr.create_array(
    store="data/example-1.zarr",
    shape=(10000, 10000),
    chunks=(1000, 1000),
    dtype="int32",
)
z1[:] = 42
z1[0, :] = np.arange(10000)
z1[:, 0] = np.arange(10000)
z2 = zarr.open("data/example-1.zarr", mode="r")
print(z2.info)
# Can be invoked with the following command:
# python3.11 -m venv .venv && source .venv/bin/activate && pip install zarr~=3.0 && python -c "import zarr;print(f'zarr version: {zarr.__version__}')" && python3.11 create.py; deactivate; rm -rf .venv
read.py
# To be used for reading a Zarr object with Zarr V3 Specification using the
# zarr Python package at a version before 3
import zarr
z1 = zarr.open("data/example-1.zarr", mode="r", zarr_version=3)
# Can be invoked with the following command:
# python3.11 -m venv .venv && source .venv/bin/activate && pip install "zarr<3.0" && python -c "import zarr;print(f'zarr version: {zarr.__version__}')" && python3.11 read.py; deactivate; rm -rf .venv
- Run the following to write a Zarr object in the final Zarr V3 Specification to the file system using zarr-python 3.x.
python3.11 -m venv .venv && source .venv/bin/activate && pip install zarr~=3.0 && python -c "import zarr;print(f'zarr version: {zarr.__version__}')" && python3.11 create.py; deactivate; rm -rf .venv
which produces the following output.
...
zarr version: 3.0.6
Type : Array
Zarr format : 3
Data type : DataType.int32
Shape : (10000, 10000)
Chunk shape : (1000, 1000)
Order : C
Read-only : True
Store type : LocalStore
Filters : ()
Serializer : BytesCodec(endian=<Endian.little: 'little'>)
Compressors : (ZstdCodec(level=0, checksum=False),)
No. bytes : 400000000 (381.5M)
- Run the following to read the written Zarr object using zarr-python 2.x. It fails with an error due to the incompatibility.
python3.11 -m venv .venv && source .venv/bin/activate && pip install "zarr<3.0" && python -c "import zarr;print(f'zarr version: {zarr.__version__}')" && python3.11 read.py; deactivate; rm -rf .venv
which produces the following output.
...
zarr version: 2.18.7
Traceback (most recent call last):
File "/Users/isaac/Developer/Dartmouth/workshop/zarr/read.py", line 6, in <module>
z1 = zarr.open("data/example-1.zarr", mode="r", zarr_version=3)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/convenience.py", line 134, in open
if contains_array(_store, path):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/storage.py", line 120, in contains_array
key = _prefix_to_array_key(store, prefix)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/_storage/store.py", line 683, in _prefix_to_array_key
sfx = _get_metadata_suffix(store) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/_storage/store.py", line 592, in _get_metadata_suffix
return _get_hierarchy_metadata(store)["metadata_key_suffix"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/_storage/store.py", line 587, in _get_hierarchy_metadata
return store._metadata_class.decode_hierarchy_metadata(store["zarr.json"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/meta.py", line 393, in decode_hierarchy_metadata
raise ValueError(f"Unexpected keys in metadata. meta={meta}")
ValueError: Unexpected keys in metadata. meta={'shape': [10000, 10000], 'data_type': 'int32', 'chunk_grid': {'name': 'regular', 'configuration': {'chunk_shape': [1000, 1000]}}, 'chunk_key_encoding': {'name': 'default', 'configuration': {'separator': '/'}}, 'fill_value': 0, 'codecs': [{'name': 'bytes', 'configuration': {'endian': 'little'}}, {'name': 'zstd', 'configuration': {'level': 0, 'checksum': False}}], 'attributes': {}, 'zarr_format': 3, 'node_type': 'array', 'storage_transformers': []}
Is there a plan to support the actual/final Zarr V3 Specification in zarr-python 2.x? If there is not, would it be possible for zarr-python 2.x to raise a more informative error, such as a NotImplementedError, when opening an object in the actual/final Zarr V3 Specification?
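To illustrate the kind of check I have in mind, here is a minimal sketch that uses only the standard library. It is not part of zarr-python: the function name `check_final_zarr_v3` is hypothetical, and it assumes a local directory store. It relies on what the error above shows, namely that the `zarr.json` written by zarr-python 3.x contains `'zarr_format': 3` and a `'node_type'` key.

```python
import json
from pathlib import Path


def check_final_zarr_v3(store_path):
    """Raise NotImplementedError if store_path looks like a final Zarr V3 node.

    Hypothetical helper, not a zarr-python API. Assumes a local directory
    store whose root metadata document is named "zarr.json".
    """
    meta_file = Path(store_path) / "zarr.json"
    if not meta_file.is_file():
        # No zarr.json at the root: nothing to check here.
        return None

    meta = json.loads(meta_file.read_text())

    # The metadata in the traceback above carries an integer zarr_format
    # of 3 together with a node_type key.
    if meta.get("zarr_format") == 3 and "node_type" in meta:
        raise NotImplementedError(
            f"{store_path} appears to follow the final Zarr V3 Specification "
            f"(node_type={meta['node_type']!r}), which this version of "
            "zarr-python cannot read."
        )
    return None
```

As a workaround on our side, something like this could be called before `zarr.open(...)` in `read.py` so that users see a clear message instead of `ValueError: Unexpected keys in metadata`.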