some improvements to our configuration model

I have a few complaints about our config model. For this issue, I consider our explicit `zarr.config` object and the various registries (data types, codecs, etc.) all part of our config model.

- Our config is effectively an untyped dict, which has two main drawbacks:
  - the config API is not IDE / autocomplete friendly
  - the config API does not emit errors when invalid or unknown configuration values are set
- When creating an array, you can provide a custom codec without registering it. But when _reading_ an array, there's no way to explicitly declare the codec classes you would like to use. Instead, you have to pursue a very indirect approach by registering the codec AND declaring the codec in the global config object. This is not smooth.
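
To make the first complaint concrete, here is a minimal sketch of how an untyped, dict-backed config swallows mistakes. The `config` dict and `set_config` helper are stand-ins invented for illustration, not zarr's actual config object:

```python
# Stand-in for a dict-backed config; this is NOT zarr's real config object.
config = {"array.order": "C"}


def set_config(updates):
    """Merge updates into the config with no validation at all."""
    config.update(updates)


# A misspelled key ("ordr") is stored silently instead of raising.
set_config({"array.ordr": "F"})

# A meaningless value is stored silently too.
set_config({"array.order": "sideways"})
```

Neither call fails, so the mistake only shows up later, far from where it was made.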

I have some ideas for addressing these concerns. 
 
1. Define an explicit, typed model of our global config. Setting invalid keys in the config will be an error. I don't think we need runtime type checks, because badly typed values will surface as runtime errors anyway, but we will define a static API surface for the config. If we do need runtime type checks, there's always #3400.

   Under this proposal, instead of this

   ```python
   zarr.config.set({'array.order': 'F'})
   ```

   we would have something like these options:

   ```python
   zarr.config.array.set_order('F')
   zarr.config.array.order = 'F'
   ```
  
We can keep the old config API around for a while as a wrapper over the new API to make the transition smooth.
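
A rough sketch of what idea 1 could look like, using plain dataclasses. The class names (`Config`, `ArrayConfig`, `_Strict`) are hypothetical and nothing here exists in zarr today. Unknown attributes raise, values are not type-checked at runtime, and a legacy `set` method forwards dotted keys to the typed attributes:

```python
from dataclasses import dataclass, field
from typing import Any, Literal


class _Strict:
    """Mixin that rejects assignment to names not declared as dataclass fields."""

    def __setattr__(self, name: str, value: Any) -> None:
        if name not in self.__dataclass_fields__:
            raise AttributeError(
                f"{type(self).__name__} has no config option {name!r}"
            )
        super().__setattr__(name, value)


@dataclass
class ArrayConfig(_Strict):
    # The annotation is for static checkers and IDEs only; no runtime check.
    order: Literal["C", "F"] = "C"


@dataclass
class Config(_Strict):
    array: ArrayConfig = field(default_factory=ArrayConfig)

    def set(self, updates: dict[str, Any]) -> None:
        """Legacy dotted-key entry point, forwarded to the typed attributes."""
        for dotted_key, value in updates.items():
            *parents, leaf = dotted_key.split(".")
            target = self
            for part in parents:
                target = getattr(target, part)  # unknown section -> AttributeError
            setattr(target, leaf, value)  # unknown option -> AttributeError


config = Config()
config.array.order = "F"            # new, typed style
config.set({"array.order": "C"})    # old style, still works via the wrapper
```

With this shape, `config.array.ordr = "F"` and `config.set({"array.ordr": "F"})` both raise `AttributeError` instead of silently storing the typo.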

2. Add a new keyword argument to array / group access routines that contains an object registry. Something like this:

   ```python
   x = read_array(..., context={"data_type_registry": {"uint8": MyUint8Class}})
   ```

   `context` is either a string or a mapping with string keys.
   The default value of `context` could be the literal string `"config"`, which uses a context defined in the global config. We could add more string values if we want to define separate prepared contexts, e.g. `"cuda"`, which would have all the CUDA codecs. But the user also has the option to define a context explicitly, which is useful for loading an array or group with exactly the desired data type / codec / chunk grid classes _without_ modifying a global config.

Expect some work in these directions soon.

  

some improvements to our configuration model #3538
