Release v24.11.00 · nv-legate/cupynumeric

This is a beta release of cuPyNumeric.

Linux x86 and ARM conda packages are available at https://anaconda.org/legate/cupynumeric.

Documentation for this release can be found at https://docs.nvidia.com/cupynumeric/24.11/.

New features

Improved API coverage

Implement np.unravel_index
Implement np.angle
Implement np.median
Implement np.ix_
Implement np.meshgrid
Implement np.expand_dims
Implement np.rot90
Implement np.round
Implement np.fft.fftshift and np.fft.ifftshift
Implement np.roll
Support full_matrices parameter of np.linalg.svd
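
As a rough illustration of the newly covered functions, the sketch below exercises several of them on small arrays. The `try`/`except` fallback to NumPy is an assumption added here so the snippet also runs where cuPyNumeric is not installed. The variable names are illustrative only, and the values in the comments follow NumPy semantics.

```python
try:
    import cupynumeric as np  # assumed to be installed from the conda channel
except ImportError:
    import numpy as np  # fallback so the sketch still runs without cuPyNumeric

a = np.arange(12).reshape(3, 4)

# Flat index 7 in a 3x4 array corresponds to row 1, column 3.
row, col = np.unravel_index(7, a.shape)

rolled = np.roll(a, 1, axis=1)            # shift each row right by one, wrapping around
rotated = np.rot90(a)                     # rotate counterclockwise; shape becomes (4, 3)
mid = np.median(a)                        # median of 0..11
expanded = np.expand_dims(a, axis=0)      # add a leading axis; shape becomes (1, 3, 4)
shifted = np.fft.fftshift(np.arange(6))   # swap the two halves of the array
xx, yy = np.meshgrid(np.arange(3), np.arange(2))  # two coordinate grids of shape (2, 3)
block = a[np.ix_([0, 2], [1, 3])]         # rows 0 and 2 crossed with columns 1 and 3
phase = np.angle(np.array([1j]))          # argument of a complex number, in radians
rounded = np.round(np.array([1.234, 5.678]), 1)  # round to one decimal place
```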

Memory management enhancements

Memory-efficient implementation of matrix multiplication - this implementation batches over the reduction dimension, achieving constant memory overhead regardless of array sizes.
Memory efficiency for stencil computation - add the np.ndarray.stencil_hint method, which instructs cuPyNumeric to pre-allocate the necessary space for ghost elements when an array is to be used in a stencil computation, reducing intermediate memory use.
Memory allocation report - report the object-memory mapping when a computation runs out of memory, to help users debug and optimize memory usage.
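
The idea of batching a matrix multiplication over the reduction dimension can be sketched on a single node with plain NumPy. This is only a conceptual illustration under stated assumptions, not cuPyNumeric's actual distributed implementation. The function name `matmul_batched_over_k` and the `k_batch` parameter are hypothetical.

```python
import numpy as np


def matmul_batched_over_k(a, b, k_batch=256):
    """Compute a @ b by accumulating partial products over slices of the
    shared (reduction) dimension. Hypothetical helper for illustration."""
    m, k = a.shape
    k2, n = b.shape
    if k != k2:
        raise ValueError("inner dimensions must match")
    out = np.zeros((m, n), dtype=np.result_type(a, b))
    for start in range(0, k, k_batch):
        stop = min(start + k_batch, k)
        # Each step reads only k_batch columns of `a` and k_batch rows of `b`,
        # so the slice of the inputs in flight does not grow with k.
        out += a[:, start:stop] @ b[start:stop, :]
    return out
```

In this sketch the input slices touched per step are bounded by `k_batch`, which is the property the release note describes. How cuPyNumeric chooses its batch size and distributes the work is not covered here.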

Enhanced infrastructure support

GH200 Grace Hopper Superchip support - allows users to leverage GH200-based cloud instances and supercomputers.
GASNet support - support GASNet as an alternative networking backend to UCX, using a GASNet wrapper, MPI wrapper, and custom build utilities.
Initial HDF5 support - distributed read/write of HDF5 files using a POSIX backend.
Automatic resource configuration at run time - automatically discover and use all the available compute resources including CPU, GPU, system memory, and framebuffer memory.
More enhancements from Legate 24.11

Other

Re-implement the RNG module on top of the C++ STL random library, removing the need to have cuRAND in CPU-only installations.

Known Issues

cuPyNumeric will emit a false-positive warning like the following:

RuntimeWarning: cuPyNumeric has not implemented numpy.ndarray.__buffer__ and is falling back to canonical NumPy. You may notice significantly decreased performance for this function call.

This happens in cases such as when an arithmetic operation is performed on a scalar array, e.g. cupynumeric.array(42) * 2. No actual performance degradation occurs in this case. We are working on a patch that will suppress this warning.
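
Until such a patch is available, one possible user-side workaround is to filter the message with Python's standard `warnings` module. This is a sketch and not an officially recommended fix. The helper name `ignore_buffer_fallback_warning` is hypothetical, and the pattern assumes the warning text starts exactly as quoted above.

```python
import warnings

# Regular expression matched against the start of the warning text.
BUFFER_WARNING = r"cuPyNumeric has not implemented numpy\.ndarray\.__buffer__"


def ignore_buffer_fallback_warning():
    """Install a filter that hides only the __buffer__ fallback warning.
    Other RuntimeWarning messages are still reported."""
    warnings.filterwarnings(
        "ignore", message=BUFFER_WARNING, category=RuntimeWarning
    )
```

Calling `ignore_buffer_fallback_warning()` once, before the affected operations run, should be enough. Keeping the filter narrow means that genuine fallback warnings for other functions remain visible.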
