Unclear support status of Zarr V3 Specification in zarr-python 2.x #2981

Open
@candleindark

Description

What is the support status of the Zarr V3 Specification in zarr-python 2.x? We hope to use zarr-python 2.x to open Zarr objects written in the Zarr V3 Specification. (Due to a dependency restriction, we are not able to upgrade zarr-python to 3.x at the moment. See dandi/dandi-cli#1609 for details.)

According to the documentation of zarr-python 2.18.5, zarr-python versions >2.12,<3 include an experimental implementation of the Zarr V3 Specification. However, through some experimentation, I have found that this experimental implementation differs substantially from the final Zarr V3 Specification as implemented in zarr-python 3.x.

For example, the following experiment shows that the latest zarr-python 2.x release (2.18.7 at the moment) is not able to open a simple Zarr object that was created with zarr-python 3.x (3.0.6 at the moment) in the final Zarr V3 Specification.

  1. Given the following files.
create.py
# To be used for creating a Zarr object with Zarr V3 Specification using the
# zarr Python package at version 3 or later

import zarr
import numpy as np

z1 = zarr.create_array(
    store="data/example-1.zarr",
    shape=(10000, 10000),
    chunks=(1000, 1000),
    dtype="int32",
)


z1[:] = 42
z1[0, :] = np.arange(10000)
z1[:, 0] = np.arange(10000)

z2 = zarr.open("data/example-1.zarr", mode="r")
print(z2.info)


# Can be invoked with the following command:
# python3.11 -m venv .venv && source .venv/bin/activate && pip install zarr~=3.0 && python -c "import zarr;print(f'zarr version: {zarr.__version__}')" && python3.11 create.py; deactivate; rm -rf .venv
read.py
# To be used for reading a Zarr object with Zarr V3 Specification using the
# zarr Python package at a version before 3

import zarr

z1 = zarr.open("data/example-1.zarr", mode="r", zarr_version=3)

# Can be invoked with the following command:
# python3.11 -m venv .venv && source .venv/bin/activate && pip install "zarr<3.0" && python -c "import zarr;print(f'zarr version: {zarr.__version__}')" && python3.11 read.py; deactivate; rm -rf .venv
  2. Run the following to write a Zarr object in the final Zarr V3 Specification to the file system using zarr-python 3.x.
python3.11 -m venv .venv && source .venv/bin/activate && pip install zarr~=3.0 && python -c "import zarr;print(f'zarr version: {zarr.__version__}')" && python3.11 create.py; deactivate; rm -rf .venv

which produces the following output.

...
zarr version: 3.0.6
Type               : Array
Zarr format        : 3
Data type          : DataType.int32
Shape              : (10000, 10000)
Chunk shape        : (1000, 1000)
Order              : C
Read-only          : True
Store type         : LocalStore
Filters            : ()
Serializer         : BytesCodec(endian=<Endian.little: 'little'>)
Compressors        : (ZstdCodec(level=0, checksum=False),)
No. bytes          : 400000000 (381.5M)
  3. Run the following to read the written Zarr object using zarr-python 2.x; an error results due to the incompatibility.
python3.11 -m venv .venv && source .venv/bin/activate && pip install "zarr<3.0" && python -c "import zarr;print(f'zarr version: {zarr.__version__}')" && python3.11 read.py; deactivate; rm -rf .venv

which produces the following output.

...
zarr version: 2.18.7
Traceback (most recent call last):
  File "/Users/isaac/Developer/Dartmouth/workshop/zarr/read.py", line 6, in <module>
    z1 = zarr.open("data/example-1.zarr", mode="r", zarr_version=3)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/convenience.py", line 134, in open
    if contains_array(_store, path):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/storage.py", line 120, in contains_array
    key = _prefix_to_array_key(store, prefix)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/_storage/store.py", line 683, in _prefix_to_array_key
    sfx = _get_metadata_suffix(store)  # type: ignore
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/_storage/store.py", line 592, in _get_metadata_suffix
    return _get_hierarchy_metadata(store)["metadata_key_suffix"]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/_storage/store.py", line 587, in _get_hierarchy_metadata
    return store._metadata_class.decode_hierarchy_metadata(store["zarr.json"])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaac/Developer/Dartmouth/workshop/zarr/.venv/lib/python3.11/site-packages/zarr/meta.py", line 393, in decode_hierarchy_metadata
    raise ValueError(f"Unexpected keys in metadata. meta={meta}")
ValueError: Unexpected keys in metadata. meta={'shape': [10000, 10000], 'data_type': 'int32', 'chunk_grid': {'name': 'regular', 'configuration': {'chunk_shape': [1000, 1000]}}, 'chunk_key_encoding': {'name': 'default', 'configuration': {'separator': '/'}}, 'fill_value': 0, 'codecs': [{'name': 'bytes', 'configuration': {'endian': 'little'}}, {'name': 'zstd', 'configuration': {'level': 0, 'checksum': False}}], 'attributes': {}, 'zarr_format': 3, 'node_type': 'array', 'storage_transformers': []}

Is there a plan to support the final Zarr V3 Specification in zarr-python 2.x? If not, would it be possible for zarr-python 2.x to raise a more informative error, such as a NotImplementedError, when opening an object written in the final Zarr V3 Specification?
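
To illustrate the kind of check I have in mind, here is a minimal sketch that uses only the standard library. The function name `check_final_v3` is hypothetical and not part of zarr-python. The detection rule is a heuristic based on the traceback above: the `zarr.json` written by zarr-python 3.0.6 carries `"zarr_format": 3` and a `"node_type"` key, whereas zarr-python 2.x looks for a `"metadata_key_suffix"` key in that file. I am assuming that these two keys are enough to tell the layouts apart for a local directory store.

```python
import json
from pathlib import Path


def check_final_v3(store_path):
    """Raise NotImplementedError if the root zarr.json looks like final Zarr V3.

    Hypothetical helper, not part of zarr-python. Only handles a local
    directory store and only inspects the root zarr.json.
    """
    meta_file = Path(store_path) / "zarr.json"
    if not meta_file.is_file():
        # No root zarr.json, so there is nothing for this check to inspect.
        return

    meta = json.loads(meta_file.read_text())

    # Heuristic (assumption): final V3 node metadata has an integer
    # zarr_format of 3 and a node_type key, as seen in the traceback.
    if meta.get("zarr_format") == 3 and "node_type" in meta:
        raise NotImplementedError(
            f"{store_path} appears to use the final Zarr V3 Specification, "
            "which this check assumes zarr-python 2.x cannot read."
        )
```

Calling `check_final_v3("data/example-1.zarr")` before `zarr.open(...)` in read.py would then fail with a message that points at the format mismatch rather than at unexpected metadata keys.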
