v25.01.00

Latest

marcinz released this 08 Feb 06:20

a972b23

This is a beta release of cuPyNumeric.

Linux x86 and ARM conda packages are available at https://anaconda.org/legate/cupynumeric.

Documentation for this release can be found at https://docs.nvidia.com/cupynumeric/25.01/.

New features

Added functionality

Add the method parameter to cupynumeric.convolve.
Increase the maximum array dimension from 4 to 6.
Experimental support for NumPy 2.0 (not reflected in the package constraints yet).
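
The note above does not list the values that the new method parameter accepts. Assuming it chooses between a direct and an FFT-based algorithm, as the method argument of scipy.signal.convolve does, the following sketch shows the two approaches producing the same result. It uses plain NumPy only and does not call cuPyNumeric; the helper names convolve_direct and convolve_fft are hypothetical and exist only for this illustration.

```python
import numpy as np  # plain NumPy, used here only to illustrate the idea


def convolve_direct(a, v):
    """Direct (sliding-window) full convolution."""
    return np.convolve(a, v, mode="full")


def convolve_fft(a, v):
    """FFT-based full convolution: multiply the spectra, then invert."""
    n = len(a) + len(v) - 1  # length of the full linear convolution
    spectrum = np.fft.rfft(a, n) * np.fft.rfft(v, n)
    return np.fft.irfft(spectrum, n)


a = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([0.5, -1.0, 0.25])

direct = convolve_direct(a, v)
via_fft = convolve_fft(a, v)

# Both approaches agree up to floating-point rounding.
print(np.allclose(direct, via_fft))
```

The direct form tends to be cheaper for short filters, while the FFT form usually pays off as the inputs grow, which is the usual reason a library exposes such a choice.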

Memory management enhancements

Take advantage of the deferred-eager pool unification in Legate. This change can increase the effective available memory capacity by up to 100% for many use cases. It also removes the need for users to adjust --eager-alloc-percentage.
Add the offload_to() API, which lets a user offload an array to a particular memory kind and discard any copies in other memories. This is useful, for example, for evicting an array from GPU memory to system memory, freeing up space for subsequent GPU tasks.

I/O improvements

Use cuFile to accelerate HDF5 reads on the GPU.
Add support for reading "binary" HDF5 datasets (particularly useful for reading boolean-type datasets).

UX improvements

Consider NUMA node topology when allocating CPU cores and memory during automatic machine configuration.
Add the LEGATE_LIMIT_STDOUT environment variable, which limits printed output to that of a single copy of the top-level program in a multi-process execution.
Remove an extraneous warning about __buffer__ being unimplemented.

Deprecations

Drop support for the Maxwell GPU architecture. cuPyNumeric now requires at least Pascal (sm_60).
