Releases: chelsea0x3b/cudarc
v0.19.4 - cuda 13.2
What's Changed
- Add total_mem() and mem_get_info() safe methods to CudaContext by @OneThing98 in #534
- driver: add upload() and raw accessors to CudaGraph by @OneThing98 in #535
- docs: add cufft by @mayocream in #538
- feat: wire FP4 packed types from float4 0.2.0 by @jorgeantonio21 in #540
- Add cuCtxCreate_v4 bindings, CudaContext::new_non_primary() and new_cig() by @w4nderlust in #539
- driver: add result-level memory pool wrappers by @OneThing98 in #544
- Remove cuCtxCreate_v4 from blocklist by @chelsea0x3b in #550
- Add CUDA 13.2 Support by @TannerRogalsky in #546
- Exposing `has_async_alloc` field in `CudaContext` by @LateinCecer in #553
- Adds cuda 13.2 support by @chelsea0x3b in #551
- Parallelize bindings-generator by @chelsea0x3b in #552
New Contributors
- @OneThing98 made their first contribution in #534
- @jorgeantonio21 made their first contribution in #540
- @w4nderlust made their first contribution in #539
- @TannerRogalsky made their first contribution in #546
- @LateinCecer made their first contribution in #553
Full Changelog: v0.19.3...v0.19.4
v0.19.3 - safe cufft
What's Changed
- Implement `DeviceRepr` for arrays by @kaathewisegit in #523
- feat: cufft safe API by @mayocream in #532
Full Changelog: v0.19.2...v0.19.3
v0.19.2 - fixes for dynamic loading with cufft & cudnn 9
What's Changed
- fix: add support for cufft 12.x by @mayocream in #530
- Add lib{name}.so.9 by @chelsea0x3b in #531
Full Changelog: v0.19.1...v0.19.2
v0.19.1 - bump float8 & libloading versions
What's Changed
- Bump float8 to 0.7.0 by @EricLBuehler in #527
- Bump libloading 0.9.0 by @chelsea0x3b in #528
Full Changelog: v0.19.0...v0.19.1
v0.19.0 - small updates
What's Changed
- Fix memory safety issue in CudaSlice::leak and optimize Drop by @wizenink in #516
- [Breaking] get_global returns CudaViewMut by @chelsea0x3b in #517
- Add fallback for loading like there is for version by @wingertge in #518
- fixes a few issues with multi gpu usage in both candle and mistralrs by @krampenschiesser in #520
- unify memcpy peer & memcpy dtod by @chelsea0x3b in #522
New Contributors
- @wingertge made their first contribution in #518
- @krampenschiesser made their first contribution in #520
Full Changelog: v0.18.2...v0.19.0
v0.18.2 - cuda 13.1, grouped gemms & more
What's Changed
- Add CUBIN binary format support to module loading by @SpenserCai in #505
- Allow accessing PTX image bytes by @TheNewJavaman in #502
- feat: cufft bindings by @mayocream in #500
- fix drop impls for cusolver by @chelsea0x3b in #506
- cublas Grouped gemm by @zackangelo in #508
- Add cuda-13010 by @chelsea0x3b in #507
New Contributors
- @SpenserCai made their first contribution in #505
- @TheNewJavaman made their first contribution in #502
- @mayocream made their first contribution in #500
Full Changelog: v0.18.1...v0.18.2
v0.18.1 - cublas fixes for cuda-13000
What's Changed
- More cublas filters for cuda-13000 by @chelsea0x3b in #496
Full Changelog: v0.18.0...v0.18.1
v0.18.0 - Fixing potential mutable aliasing
What's Changed
- Adding CudaContext::per_thread_stream() by @chelsea0x3b in #491
- adds more filters for cuda 13 by @chelsea0x3b in #492
- Deprecate memcpy_stod, memcpy_dtov. Adds clone_htod, clone_dtoh by @chelsea0x3b in #493
- Small docs updates by @chelsea0x3b in #494
- [Breaking] Fixing mutable aliasing soundness issue by @chelsea0x3b in #495
Full Changelog: v0.17.8...v0.18.0
v0.17.8 - unified memory, cutensor sys/result bindings
What's Changed
- Unified memory: slice methods. by @npatsakula in #485
- Using free_sync if ctx doesn't have async_alloc by @coreylowman in #486
- Add minimal cutensor bindings by @wizenink in #483
New Contributors
- @npatsakula made their first contribution in #485
Full Changelog: v0.17.7...v0.17.8
v0.17.7 - `-F fallback-latest`, and attribute getters
What's Changed
- Expose a getter for compute capability by @wizenink in #477
- Add safe API for CUDA constant memory operations by @wizenink in #478
- Add safe API for querying CUDA function attributes by @wizenink in #479
- Add context configuration APIs by @wizenink in #481
- Adds fallback-latest feature flag by @coreylowman in #482
Full Changelog: v0.17.6...v0.17.7