UnicodeDecodeError when trying to load netcdf file via dap4 (from THREDDS) #10879

@FObersteiner

What happened?

Loading a netcdf4 file from a THREDDS server via dap4 failed with UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6202: ordinal not in range(128)

What did you expect to happen?

The dataset loads without raising an error.

Minimal Complete Verifiable Example

# /// script
# requires-python = ">=3.12"
# dependencies = [
#   "xarray[complete]@git+https://github.com/pydata/xarray.git@main",
# ]
# ///

import xarray as xr

url = "dap4://thredds.atmohub.kit.edu/thredds/dap4/iagos-caribic/IAGOS-CARIBIC_MS_files_collection_20250711/CARIBIC-2/MS_20200304_591_CPT_MUC_10s_V16.nc"

ds = xr.load_dataset(url, engine="pydap", decode_cf=False, decode_times=False, decode_timedelta=False, decode_coords=False)
print(ds)

Relevant log output

Traceback (most recent call last):
  File "/home/user/Code/Python/pyTesting/netcdf/./dap4_xarray.py", line 18, in <module>
    ds = xr.load_dataset(url, engine="pydap", decode_cf=False, decode_times=False, decode_timedelta=False, decode_coords=False)
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/api.py", line 165, in load_dataset
    with open_dataset(filename_or_obj, **kwargs) as ds:
         ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/api.py", line 612, in open_dataset
    ds = _dataset_from_backend_dataset(
        backend_ds,
    ...<11 lines>...
        **kwargs,
    )
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/api.py", line 302, in _dataset_from_backend_dataset
    ds = _maybe_create_default_indexes(backend_ds)
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/api.py", line 278, in _maybe_create_default_indexes
    return ds.assign_coords(Coordinates(to_index))
                            ~~~~~~~~~~~^^^^^^^^^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/coordinates.py", line 315, in __init__
    index, index_vars = create_default_index_implicit(var, list(coords))
                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexes.py", line 1638, in create_default_index_implicit
    index = PandasIndex.from_variables(dim_var, options={})
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexes.py", line 720, in from_variables
    data = var._data if isinstance(var._data, PandasIndexingAdapter) else var.data  # type: ignore[redundant-expr]
                                                                          ^^^^^^^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/variable.py", line 456, in data
    duck_array = self._data.get_duck_array()
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexing.py", line 943, in get_duck_array
    duck_array = self.array.get_duck_array()
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexing.py", line 897, in get_duck_array
    return self.array.get_duck_array()
           ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/coding/variables.py", line 71, in get_duck_array
    return duck_array_ops.astype(self.array.get_duck_array(), dtype=self.dtype)
                                 ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexing.py", line 737, in get_duck_array
    array = self.array[self.key]
            ~~~~~~~~~~^^^^^^^^^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/pydap_.py", line 51, in __getitem__
    return indexing.explicit_indexing_adapter(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        key, self.shape, indexing.IndexingSupport.BASIC, self._getitem
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexing.py", line 1129, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/pydap_.py", line 56, in _getitem
    result = robust_getitem(self.array, key, catch=ValueError)
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/common.py", line 296, in robust_getitem
    return array[key]
           ~~~~~^^^^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/pydap/model.py", line 526, in __getitem__
    data = self._get_data_index(index)
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/pydap/model.py", line 575, in _get_data_index
    return self._get_data()[index]
           ~~~~~~~~~~~~~~~~^^^^^^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/pydap/handlers/dap.py", line 548, in __getitem__
    dataset = UNPACKDAP4DATA(r, self.checksums, self.user_charset).dataset
              ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/pydap/handlers/dap.py", line 1002, in __init__
    self.dmr, self.endianness = self.safe_dmr_and_data()
                                ~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/pydap/handlers/dap.py", line 1062, in safe_dmr_and_data
    dmr = self.raw.read(dmr_length).decode(self.user_charset)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 5394: ordinal not in range(128)

Anything else we need to know?

Related?

#4052

My guess

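Looking at the traceback, the issue seems to be that pydap decodes the DMR part of the response as ASCII, while the file's metadata apparently contains non-ASCII characters (0xc3 is a typical lead byte of a two-byte UTF-8 sequence). As a test for this hypothesis, I replaced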

    dataset = UNPACKDAP4DATA(r, self.checksums, self.user_charset).dataset

with

    dataset = UNPACKDAP4DATA(r, self.checksums, "UTF-8").dataset

in the pydap code and everything worked just fine!
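The decoding failure itself can be reproduced without a server. The snippet below is only a minimal sketch: `dmr_bytes` and the helper `decode_dmr` are hypothetical stand-ins (the attribute value "Universität" is made up, not taken from the actual file), and the only thing borrowed from pydap is the `.decode(charset)` call visible in the traceback.

```python
# Hypothetical stand-in for a DMR fragment that contains a non-ASCII character.
# The attribute name and value are illustrative, not taken from the real file.
dmr_bytes = '<Attribute name="institution" value="Universität"/>'.encode("utf-8")


def decode_dmr(raw: bytes, charset: str) -> str:
    """Decode raw DMR bytes with the given charset (mirrors raw.decode(user_charset))."""
    return raw.decode(charset)


try:
    decode_dmr(dmr_bytes, "ascii")
except UnicodeDecodeError as err:
    # 'ä' is encoded as the two bytes 0xc3 0xa4 in UTF-8; ASCII rejects the first one.
    print(f"ascii failed on byte {dmr_bytes[err.start]:#x} at position {err.start}")

# Decoding the same bytes as UTF-8 succeeds.
print(decode_dmr(dmr_bytes, "utf-8"))
```

This matches the observed behavior: the same bytes that break the ASCII codec decode cleanly as UTF-8, which is consistent with the hard-coded "UTF-8" test above.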

So I attempted to set the encoding as a keyword argument to load_dataset:

ds = xr.load_dataset(url, engine="pydap", decode_cf=False, decode_times=False, decode_timedelta=False, decode_coords=False, user_charset="UTF-8")

but unfortunately, the setting gets lost somewhere along the way. This might happen in the pydap code, so it is not necessarily xarray's fault.

Environment

>>> xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.13.7 (main, Aug 18 2025, 19:20:03) [Clang 20.1.4 ]
python-bits: 64
OS: Linux
OS-release: 6.12.48+deb13-amd64
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.6
libnetcdf: 4.9.3

xarray: 2025.10.2.dev18+ge49cfc4f2
pandas: 2.3.3
numpy: 2.3.4
scipy: 1.16.2
netCDF4: 1.7.3
pydap: 3.5.8
h5netcdf: 1.7.3
h5py: 3.15.1
zarr: 3.1.3
cftime: 1.6.5
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.6.0
dask: 2025.10.0
distributed: 2025.10.0
matplotlib: 3.10.7
cartopy: 0.25.0
seaborn: 0.13.2
numbagg: 0.9.3
fsspec: 2025.9.0
cupy: None
pint: None
sparse: 0.17.0
flox: 0.10.7
numpy_groupies: 0.11.3
setuptools: None
pip: None
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None
