Releases: pydata/xarray
v0.4
This is a major release that includes a number of new features and bug fixes, including several changes from existing behavior.
Highlights include:
- Automatic alignment of index labels in arithmetic and when combining arrays or datasets.
- Aggregations like mean now skip missing values by default.
- Relaxed equality rules in concat and merge for variables with equal value(s) but different shapes.
- New drop method for dropping variables or index labels.
- Support for reindexing with a fill method like pandas.
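Two of these highlights follow pandas conventions, and the pandas side can be sketched as below. This is only an illustration using pandas itself, not xray: the variable names are invented for the example, and the matching xray method signatures are not shown because this summary does not give them.

```python
import numpy as np
import pandas as pd

# A series with one missing value.
temps = pd.Series([1.0, np.nan, 3.0])

# pandas skips NaN by default when aggregating; this is the convention
# the highlights describe for aggregations like mean.
mean_skipping_nan = temps.mean()
mean_propagating_nan = temps.mean(skipna=False)

# Reindexing with a fill method: labels absent from the original index
# take the most recent earlier value ("ffill" = forward fill).
sparse = pd.Series([10, 20], index=[0, 2])
filled = sparse.reindex([0, 1, 2, 3], method="ffill")
```

Passing skipna=False restores the older propagate-NaN result, which is useful when a missing value should invalidate the whole aggregate.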
For more details, see the release notes.
v0.4 release candidate
This is a release candidate for v0.4. This version of xray includes some major changes, so I wanted it to get some testing before its official release.
For a list of changes, please read the release notes from the development version of the documentation:
http://xray.readthedocs.org/en/latest/whats-new.html
To test it out, use:
pip install https://github.com/xray/xray/archive/v0.4rc1.zip
v0.3.2
This release focused on bug-fixes, speedups and resolving some niggling inconsistencies.
There are a few cases where the behavior of xray differs from the previous version. However, I expect that in almost all cases your code will continue to run unmodified.
xray now requires pandas v0.15.0 or later. This was necessary for supporting TimedeltaIndex without too many painful hacks.
Backwards incompatible changes:
- Arrays of datetime.datetime objects are now automatically cast to datetime64[ns] arrays when stored in an xray object, using machinery borrowed from pandas:

      In [1]: from datetime import datetime

      In [2]: xray.Dataset({'t': [datetime(2000, 1, 1)]})
      Out[2]:
      <xray.Dataset>
      Dimensions:  (t: 1)
      Coordinates:
        * t        (t) datetime64[ns] 2000-01-01
      Variables:
          *empty*
- xray now has support (including serialization to netCDF) for pandas.TimedeltaIndex. datetime.timedelta objects are thus accordingly cast to timedelta64[ns] objects when appropriate.
- Masked arrays are now properly coerced to use NaN as a sentinel value.
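The target representations named in this list can be reproduced with NumPy alone. The sketch below shows only the resulting dtypes and the NaN sentinel, under the assumption that plain NumPy casts are a fair stand-in; it does not show how xray or pandas perform the conversion internally, and the variable names are illustrative.

```python
from datetime import datetime, timedelta

import numpy as np

# Python datetime and timedelta objects cast to nanosecond-precision dtypes.
times = np.array([datetime(2000, 1, 1)], dtype="datetime64[ns]")
deltas = np.array([timedelta(days=1)], dtype="timedelta64[ns]")

# A masked array turned into a plain float array, with NaN standing in
# for the masked slot.
masked = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])
as_nan = masked.filled(np.nan)
```

NaN only exists for floating-point data, so an integer masked array would need to be converted to float before it could be filled this way.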
Enhancements:
- Due to popular demand, we have added experimental attribute-style access as a shortcut for dataset variables, coordinates and attributes:

      In [3]: ds = xray.Dataset({'tmin': ([], 25, {'units': 'celsius'})})

      In [4]: ds.tmin.units
      Out[4]: 'celsius'

  Tab-completion for these variables should work in editors such as IPython. However, setting variables or attributes in this fashion is not yet supported because there are some unresolved ambiguities.
- You can now use a dictionary for indexing with labeled dimensions. This provides a safe way to do assignment with labeled dimensions:

      In [5]: array = xray.DataArray(np.zeros(5), dims=['x'])

      In [6]: array[dict(x=slice(3))] = 1

      In [7]: array
      Out[7]:
      <xray.DataArray (x: 5)>
      array([ 1.,  1.,  1.,  0.,  0.])
      Coordinates:
        * x        (x) int64 0 1 2 3 4
- Non-index coordinates can now be faithfully written to and restored from netCDF files. This is done according to CF conventions when possible by using the coordinates attribute on a data variable. When not possible, xray defines a global coordinates attribute.
- Preliminary support for converting xray.DataArray objects to and from CDAT cdms2 variables.
- We sped up any operation that involves creating a new Dataset or DataArray (e.g., indexing, aggregation, arithmetic) by 30 to 50%. The full speedup requires cyordereddict to be installed.
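The idea behind dictionary indexing can be sketched with plain NumPy: a mapping from dimension name to indexer is translated into a positional index tuple. The helper dict_to_positional below is hypothetical and written only for illustration; it is not xray's implementation.

```python
import numpy as np


def dict_to_positional(dims, indexers):
    """Turn {dimension name: indexer} into a tuple ordered like dims."""
    unknown = set(indexers) - set(dims)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    # Dimensions that are not mentioned are kept whole via slice(None).
    return tuple(indexers.get(dim, slice(None)) for dim in dims)


dims = ("x", "y")
values = np.zeros((5, 2))

# Assign along "x" by name, without having to know its axis position.
values[dict_to_positional(dims, {"x": slice(3)})] = 1
```

Because the indexer is looked up by name, the assignment keeps working if the axes are later reordered, which is what makes this form safer than positional indexing.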
Bug fixes:
- Fix for to_dataframe() with 0d string/object coordinates
- Fix for to_netcdf with 0d string variable
- Fix writing datetime64 arrays to netcdf if NaT is present
- Fix align silently upcasting data arrays when NaNs are inserted
v0.3.1
This is mostly a bug-fix release to make xray compatible with the latest release of pandas (v0.15).
We added several features to better support working with missing values and exporting xray objects to pandas. We also reorganized the internal API for serializing and deserializing datasets, but this change should be almost entirely transparent to users.
Other than breaking the experimental DataStore API, there should be no backwards incompatible changes.
New features:
- Added count and dropna methods, copied from pandas, for working with missing values.
- Added DataArray.to_pandas for converting a data array into the pandas object with the same dimensionality (1D to Series, 2D to DataFrame, etc.).
- Support for reading gzipped netCDF3 files.
- Reduced memory usage when writing netCDF files.
- 'missing_value' is now supported as an alias for the '_FillValue' attribute on netCDF variables.
- Trivial indexes, equivalent to range(n) where n is the length of the dimension, are no longer written to disk.
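For readers unfamiliar with the pandas methods these features are modeled on, here is a small sketch of count and dropna in pandas, plus the dimensionality mapping described for DataArray.to_pandas. The helper to_pandas_like is hypothetical and operates on plain NumPy arrays; it only illustrates the 1D-to-Series, 2D-to-DataFrame rule and is not the xray method.

```python
import numpy as np
import pandas as pd

series = pd.Series([1.0, np.nan, 3.0])

# count() reports the number of non-missing values.
n_valid = series.count()

# dropna() returns a copy with the missing entries removed.
without_nan = series.dropna()


def to_pandas_like(values):
    """Pick the pandas container matching the number of dimensions."""
    values = np.asarray(values)
    if values.ndim == 1:
        return pd.Series(values)
    if values.ndim == 2:
        return pd.DataFrame(values)
    raise ValueError(f"no mapping sketched for {values.ndim} dimensions")
```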
Bug fixes:
- Compatibility fixes for pandas v0.15.
- Fixes for display and indexing of NaT (not-a-time).
- Fix slicing by label when an argument is a data array.
- Test data is now shipped with the source distribution.
- Ensure order does not matter when doing arithmetic with scalar data arrays.
- Order of dimensions preserved with DataArray.to_dataframe.
v0.3
New features:
- Revamped coordinates: "coordinates" now refer to all arrays that are not used to index a dimension. Coordinates are intended to allow for keeping track of arrays of metadata that describe the grid on which the points in "variable" arrays lie. They are preserved (when unambiguous) even through mathematical operations.
- Dataset math: xray.Dataset objects now support all arithmetic operations directly. Dataset-array operations map across all dataset variables; dataset-dataset operations act on each pair of variables with the same name.
- GroupBy math: This provides a convenient shortcut for normalizing by the average value of a group.
- The dataset __repr__ method has been entirely overhauled; dataset objects now show their values when printed.
- You can now index a dataset with a list of variables to return a new dataset: ds[['foo', 'bar']].
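The calculation that GroupBy math is meant to shorten, normalizing each value by its group's average, can be sketched in pandas as a rough analogue. The xray syntax itself is not shown in these notes, so nothing below should be read as xray's API; the data and names are invented for the example.

```python
import pandas as pd

# Four observations, each tagged with the month it belongs to.
temps = pd.Series([10.0, 12.0, 20.0, 22.0])
month = pd.Series([1, 1, 2, 2])

# Broadcast each group's mean back onto its members...
group_mean = temps.groupby(month).transform("mean")

# ...then subtract to get each value's deviation from its group average.
anomalies = temps - group_mean
```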
Backwards incompatible changes:
- Dataset.__eq__ and Dataset.__ne__ are now element-wise operations instead of comparing all values to obtain a single boolean. Use the method Dataset.equals instead.
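The same distinction exists in NumPy, which makes for a convenient analogy: == compares element by element, while a separate function answers whether two arrays are equal as a whole. This sketch uses NumPy arrays rather than xray datasets, so it illustrates the semantics only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 0, 3])

# Element-wise comparison returns one boolean per element.
elementwise = a == b

# Whole-array comparison returns a single boolean.
same = np.array_equal(a, b)
```

Code that previously relied on == yielding a single boolean, for example inside an if statement, is the kind that needs to switch to the explicit whole-object comparison.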
Deprecations:
- Dataset.noncoords is deprecated: use Dataset.vars instead.
- Dataset.select_vars deprecated: index a Dataset with a list of variable names instead.
- DataArray.select_vars and DataArray.drop_vars deprecated: use DataArray.reset_coords instead.
v0.2
This is a major release that includes some new features and quite a few bug
fixes. Here are the highlights:
- There is now a direct constructor for DataArray objects, which makes it possible to create a DataArray without using a Dataset. This is highlighted in the refreshed tutorial.
- You can perform aggregation operations like mean directly on xray.Dataset objects, thanks to Joe Hamman. These aggregation methods also work on grouped datasets.
- xray now works on Python 2.6, thanks to Anna Kuznetsova.
- A number of methods and attributes were given more sensible (usually shorter) names: labeled -> sel, indexed -> isel, select -> select_vars, unselect -> drop_vars, dimensions -> dims, coordinates -> coords, attributes -> attrs.
- New Dataset.load_data and Dataset.close methods for datasets facilitate lower-level control of data loaded from disk.
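When updating older scripts, the renames listed above can be applied mechanically. The sketch below is a hypothetical helper, not something shipped with xray: it assumes the old names appear as attribute accesses (preceded by a dot) and rewrites them with a regular expression, so it can also touch unrelated objects that happen to use the same attribute names.

```python
import re

# Old name -> new name, taken from the list of renames in this release.
RENAMES = {
    "labeled": "sel",
    "indexed": "isel",
    "select": "select_vars",
    "unselect": "drop_vars",
    "dimensions": "dims",
    "coordinates": "coords",
    "attributes": "attrs",
}

# Match ".oldname" only as a whole word, so ".select_vars" is left alone.
_PATTERN = re.compile(r"\.(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def modernize(source):
    """Rewrite old attribute names in a string of source code."""
    return _PATTERN.sub(lambda m: "." + RENAMES[m.group(1)], source)
```

For example, modernize("ds.labeled(time=0).dimensions") gives "ds.sel(time=0).dims". Reviewing the result by hand is still advisable, since the pattern has no knowledge of which objects are xray objects.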
v0.2.0alpha
Tag v0.2.0alpha for py 2.6 compat
v0.1.1
xray 0.1.1 is a bug-fix release that includes changes that should be almost
entirely backwards compatible with v0.1:
- Python 3 support (#53)
- Required numpy version relaxed to 1.7 (#129)
- Return numpy.datetime64 arrays for non-standard calendars (#126)
- Support for opening datasets associated with NetCDF4 groups (#127)
- Bug-fixes for concatenating datetime arrays (#134)
Special thanks to new contributors Thomas Kluyver, Joe Hamman and Alistair
Miles.