Selection of input data along time coordinate fails #68

@observingClouds

Thanks @matschreiner for your great work in #55. I just tried it and ran into an issue when selecting along the time dimension.

What I did

First, I provided a start and end datetime in my config file, which resulted in:

E       dataclass_wizard.errors.ParseError: Failure parsing field `start` in class `Range`. Expected a type [<class 'str'>, <class 'int'>, <class 'float'>], got datetime.
E         value: datetime.datetime(2022, 4, 1, 0, 0)
E         error: Object was not in any of Union types
E         tag_key: '__tag__'
E         json_object: '{"start": "2022-04-01T00:00:00", "end": "2022-04-01T03:00:00"}'
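
A possible workaround, as a minimal sketch: the `value:` line of the traceback suggests the config loader has already turned the timestamps into `datetime` objects, so they could be converted back to ISO 8601 strings before the `Range` field is parsed. The helper name `normalise_config_time` is hypothetical and not part of the package, and where it would be called in the loading code is an assumption.

```python
import datetime as dt


def normalise_config_time(value):
    """Hypothetical helper: turn datetime objects into ISO 8601 strings.

    Other values (str, int, float) are returned unchanged, so the result
    always fits a Union[str, int, float] field.
    """
    if isinstance(value, dt.datetime):
        return value.isoformat()
    return value


# The datetime from the traceback becomes the string shown in `json_object`.
print(normalise_config_time(dt.datetime(2022, 4, 1, 0, 0)))  # 2022-04-01T00:00:00
print(normalise_config_time("2022-04-01T03:00:00"))          # 2022-04-01T03:00:00
```

The alternative would be to accept `datetime` in the type union of `Range` itself, which avoids the round trip through a string.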

Second, I tried providing the time as a string, but this resulted in:

    def check_point_in_dataset(coord, point, ds):
        """
        check that the requested point is in the data.
        """
        if point is not None and point not in ds[coord].values:
>           raise ValueError(
                f"Provided value for coordinate {coord} ({point}) is not in the data."
            )
E           ValueError: Provided value for coordinate time (2022-04-10 00:00:00) is not in the data.

The second issue stems from check_point_in_dataset(), which does not convert time values. The config provides a str while the dataset holds datetime64 values, so the membership test fails even when the requested time is available:

>>> import xarray as xr
>>> ds = xr.open_zarr("https://object-store.os-api.cci1.ecmwf.int/mllam-testdata/danra_cropped/v0.2.0/pressure_levels.zarr")
>>> ds.sel({'time': slice("2022-04-01T00:00:00","2022-04-01T03:00:00")}).time
Out[8]: 
<xarray.DataArray 'time' (time: 2)> Size: 16B
array(['2022-04-01T00:00:00.000000000', '2022-04-01T03:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 16B 2022-04-01 2022-04-01T03:00:00
Attributes:
    standard_name:  time
>>> "2022-04-01T00:00:00" in ds['time'].values
False
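
A time-aware version of the check could look like the following minimal sketch. It assumes the time coordinate can be represented as `datetime64[ns]` and uses only NumPy so that it runs without the remote dataset. The function name `point_in_time_values` is hypothetical and is not the package's API.

```python
import datetime as dt

import numpy as np


def point_in_time_values(point, values):
    """Hypothetical check: is `point` one of `values` after time conversion?

    Both sides are converted to datetime64[ns] before comparing, so a str
    or datetime.datetime from the config can match the dataset values.
    """
    if point is None:
        return True
    values = np.asarray(values, dtype="datetime64[ns]")
    point64 = np.datetime64(point, "ns")
    return bool((values == point64).any())


# Same two time steps as in the selection above.
times = np.array(
    ["2022-04-01T00:00:00", "2022-04-01T03:00:00"], dtype="datetime64[ns]"
)
print(point_in_time_values("2022-04-01T00:00:00", times))        # True
print(point_in_time_values(dt.datetime(2022, 4, 1, 3), times))   # True
print(point_in_time_values("2022-04-10T00:00:00", times))        # False
```

With this conversion, a string or a datetime from the config compares equal to the matching dataset value, while a time that is genuinely missing is still rejected.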

Also, is there a reason why we call check_point_in_dataset() only when a coordinate is named time? Do we need this check at all? Doesn't xarray already raise a good error message?

What I expected
I expected both attempts to work.

Labels: bug