CHANGELOG.md
Lines changed: 44 additions & 3 deletions
@@ -4,20 +4,61 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [dev] - XXX. XX, XXXX
+## [0.20.0] - XXX. XX, XXXX

 ### Added

+* Added `dpctl.WorkGroupMemory` class representing `sycl::ext::oneapi::experimental::work_group_memory`, to be used as a kernel argument type [gh-1984](https://github.com/IntelPython/dpctl/pull/1984)
+* Added `dpctl.LocalAccessor` class representing `sycl::local_accessor`, to be used as a kernel argument type [gh-1991](https://github.com/IntelPython/dpctl/pull/1991)
+* Added `dpctl.SyclPlatform.get_devices` method for getting all `dpctl.SyclDevice` objects for the platform [gh-1992](https://github.com/IntelPython/dpctl/pull/1992)
+* Added support for the composite devices extension for Level Zero devices, usable with some devices when setting `ZE_FLAT_DEVICE_HIERARCHY=COMBINED` [gh-1993](https://github.com/IntelPython/dpctl/pull/1993)
 * Added `out` keyword to `tensor.take` [gh-2010](https://github.com/IntelPython/dpctl/pull/2010)
+* Added `dpctl.RawKernelArg` class representing `sycl::ext::oneapi::experimental::raw_kernel_arg`, to be used as a kernel argument type [gh-2038](https://github.com/IntelPython/dpctl/pull/2038)
+* Added `dpctl.SyclDevice` methods for querying, enabling, and disabling peer access between devices [gh-2077](https://github.com/IntelPython/dpctl/pull/2077), [gh-2082](https://github.com/IntelPython/dpctl/pull/2082)

 ### Changed

+* Updated Level Zero loader detection to no longer rely on reading `libur_adapter_level_zero.so` for the loader filename [gh-2025](https://github.com/IntelPython/dpctl/pull/2025)
+* Updated integer array indexing to align with the 2024.12 array API specification [gh-2032](https://github.com/IntelPython/dpctl/pull/2032)
 * Support for Boolean data-type is added to `dpctl.tensor.ceil`, `dpctl.tensor.floor`, and `dpctl.tensor.trunc` [gh-2033](https://github.com/IntelPython/dpctl/pull/2033)
-* Changed implementation of `DPCTLPlatform_GetDefaultContext` from using deprecated `ext_oneapi_get_default_context` to `khr_get_default_context` [#2042](https://github.com/IntelPython/dpctl/pull/2042)
-* Updated `repr` to show the shape of the abbreviated arrays and show the shape and data type of zero-size arrays [#2067](https://github.com/IntelPython/dpctl/pull/2067)
+* Changed implementation of `DPCTLPlatform_GetDefaultContext` from using deprecated `ext_oneapi_get_default_context` to `khr_get_default_context` [gh-2042](https://github.com/IntelPython/dpctl/pull/2042)
+* Updated supported array API specification version to 2024.12 [gh-2047](https://github.com/IntelPython/dpctl/pull/2047)
+* Implementation struct for `tensor.imag` now uses a static member value for the imaginary part of real-valued inputs [gh-2063](https://github.com/IntelPython/dpctl/pull/2063)
+* Updated `repr` to show the shape of the abbreviated arrays and show the shape and data type of zero-size arrays [gh-2067](https://github.com/IntelPython/dpctl/pull/2067)
+* Changed `tensor.__array_namespace_info__().capabilities()["max dimensions"]` to `None` [gh-2071](https://github.com/IntelPython/dpctl/pull/2071)

 ### Fixed

+* Refactored code common to accumulation operations (`dpt.cumulative_sum`, `dpt.cumulative_prod`, `dpt.cumulative_logsumexp`) and removed unnecessary event initialization [gh-2011](https://github.com/IntelPython/dpctl/pull/2011)
+* Fixed incorrect results for `dpt.cumulative_sum` and `dpt.cumulative_prod` when `dtype=dpt.bool` [gh-2018](https://github.com/IntelPython/dpctl/pull/2018)
+* Fixed a typo in `dpctl.SyclPlatform` repr [gh-2035](https://github.com/IntelPython/dpctl/pull/2035)
+* Fixed a bug in `tensor.asarray` where `order="K"` could fail to produce an array sufficient for the internal copy operation for some edge cases, including a contiguous array with permuted dimensions [gh-2058](https://github.com/IntelPython/dpctl/pull/2058)
+* Fixed a typo in `dpctl.memory.USMAllocationError` [gh-2072](https://github.com/IntelPython/dpctl/pull/2072)
+
+### Maintenance
+
+* Document `dpctl.device_type`, `dpctl.backend_type`, `dpctl.event_status_type`, and `dpctl.global_mem_cache_type` enums [gh-2019](https://github.com/IntelPython/dpctl/pull/2019)
+* Updated `SYCL_INCLUDE_DIR_HINT` in Conda recipe [gh-2039](https://github.com/IntelPython/dpctl/pull/2039)
+* Updated expected dtypes in element-wise function docstrings [gh-2041](https://github.com/IntelPython/dpctl/pull/2041), [gh-2048](https://github.com/IntelPython/dpctl/pull/2048)
+* Set `ARRAY_API_TESTS_VERSION=2024.12` when running array API conformity job in CI [gh-2046](https://github.com/IntelPython/dpctl/pull/2046)
+* Install `hwloc` when running CI job for nightly SYCL compiler [gh-2050](https://github.com/IntelPython/dpctl/pull/2050)
+* Added `cython-lint` to `pre-commit` to improve style and readability of Cython code [gh-2056](https://github.com/IntelPython/dpctl/pull/2056)
+* Skip upload jobs when GitHub CI is called from a forked repo [gh-2059](https://github.com/IntelPython/dpctl/pull/2059)
+* Disable nightly tests run from forked repos [gh-2060](https://github.com/IntelPython/dpctl/pull/2060)
+* Fixed a typo in beginner's guide example [gh-2061](https://github.com/IntelPython/dpctl/pull/2061)
+* Updated bandit version [gh-2075](https://github.com/IntelPython/dpctl/pull/2075)
This release features official, out-of-the-box support for compiling `dpctl` for specified AMD GPU architectures, the new function `tensor.top_k`, a radix-sort-based implementation of sorting functions, and improvements to interoperability with DLPack through `tensor.dldevice_to_sycl_device` and `tensor.sycl_device_to_dldevice`.