Add mkl::linalg_svd relative Ops #1511


Open · wants to merge 21 commits into main

Conversation

yucai-intel
Contributor

@yucai-intel yucai-intel commented Mar 26, 2025

This pull request introduces support for Singular Value Decomposition (SVD) on XPU devices using oneMKL. It includes the implementation of SVD kernels, necessary utility functions, and test updates. The following ops are enabled on XPU devices:

  • _linalg_svd
  • _linalg_svd.U
  • linalg_svd
  • linalg_svd.U
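
For reference, `torch.linalg.svd` follows the same conventions as NumPy's SVD; a small NumPy sketch of the shapes and the reconstruction identity these ops must satisfy (illustrative, not the XPU kernel itself):

```python
import numpy as np

# For A of shape (m, n) with k = min(m, n):
#   full_matrices=True  -> U is (m, m), S is (k,), Vh is (n, n)
#   full_matrices=False -> U is (m, k), S is (k,), Vh is (k, n)
A = np.random.rand(5, 3)
U, S, Vh = np.linalg.svd(A, full_matrices=True)
assert U.shape == (5, 5) and S.shape == (3,) and Vh.shape == (3, 3)

# Reconstruction identity: A == U[:, :k] @ diag(S) @ Vh
k = min(A.shape)
assert np.allclose(U[:, :k] @ np.diag(S) @ Vh, A)
```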

@yucai-intel
Contributor Author

yucai-intel commented Mar 27, 2025

Test cases related to complex data types fail because oneDNN does not support them.

@yucai-intel yucai-intel changed the title [WIP] Add mkl::linalg_svd relative Ops Add mkl::linalg_svd relative Ops Apr 9, 2025
@yucai-intel yucai-intel requested a review from CuiYifeng April 11, 2025 01:53
Contributor

@CuiYifeng CuiYifeng left a comment


Please reactivate some skipped cases such as https://github.com/intel/torch-xpu-ops/blob/main/test/xpu/skip_list_common.py#L1397. Otherwise, CI cannot test with these svd cases.

@CuiYifeng CuiYifeng force-pushed the slogdet branch 4 times, most recently from 56cebb3 to 9ac4024 Compare April 25, 2025 08:54
@CuiYifeng CuiYifeng force-pushed the slogdet branch 2 times, most recently from e13dc4f to da85fe6 Compare May 7, 2025 05:27
@CuiYifeng CuiYifeng requested a review from Copilot May 13, 2025 13:11
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR adds support for MKL-based SVD operations on the XPU backend, including new composite functions and updates to native function definitions for linalg_svd and related ops. Key changes include:

  • New YAML definitions for _linalg_svd and linalg_svd (and their variants) exposing the compute_uv flag.
  • Removal of several SVD-related skip tests in the xpu test skip list.
  • New MKL-based implementation files for BatchLinearAlgebra (header and source) and updated XPU dispatch registration.

Reviewed Changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
yaml/native/native_functions.yaml Added definitions for new SVD ops and composite functions.
test/xpu/skip_list_common.py Removed skip tests for SVD float64 variants on XPU.
src/ATen/native/xpu/mkl/BatchLinearAlgebra.h Declared the svd_mkl prototype.
src/ATen/native/xpu/mkl/BatchLinearAlgebra.cpp Implemented the svd_mkl function using oneMKL calls.
src/ATen/native/xpu/BatchLinearAlgebra.cpp Registered the XPU dispatch for the SVD kernel.
Files not reviewed (1)
  • src/ATen/native/xpu/XPUFallback.template: Language not supported

@CuiYifeng
Contributor

@fengyuan14 @majing921201 Please help review, thanks.

#if defined(USE_ONEMKL)
native::xpu::svd_mkl(A, full_matrices, compute_uv, driver, U, S, Vh, info);
#else
const auto A_cpu = A.to(A.options()
Contributor


Why do we need to fall back to CPU when oneMKL is not installed, instead of throwing an error? If the CPU path fails and throws an error on the CPU side, that will confuse users.

Contributor

@CuiYifeng CuiYifeng May 20, 2025


@majing921201 There is still an op registration on the XPU device when USE_ONEMKL is OFF. Please note that USE_ONEMKL will be renamed to USE_ONEMKL_XPU in #1642, which is unrelated to USE_MKL in stock PyTorch.
Before the oneMKL XPU build is ON by default, such a fallback path is needed in case functionality is broken.
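
The fallback behavior under discussion can be sketched roughly as follows (an illustrative Python sketch, with NumPy standing in for the CPU kernel; `svd_with_fallback` is a hypothetical name, not the actual API):

```python
import numpy as np

def svd_with_fallback(A, use_onemkl_xpu=False):
    # Hypothetical sketch: when the oneMKL XPU path is unavailable, run the
    # decomposition on a host (CPU) copy instead of failing at registration.
    if use_onemkl_xpu:
        raise NotImplementedError("oneMKL XPU path not modeled in this sketch")
    A_cpu = np.asarray(A)  # stands in for copying the tensor to CPU
    U, S, Vh = np.linalg.svd(A_cpu, full_matrices=False)
    return U, S, Vh  # results would then be copied back to the XPU device

U, S, Vh = svd_with_fallback(np.eye(3))
assert np.allclose(U @ np.diag(S) @ Vh, np.eye(3))
```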

@@ -0,0 +1,352 @@
#if defined(USE_ONEMKL)
Contributor


When the scope is the whole source file, please change the logic in CMake instead, to filter out the file when USE_ONEMKL is off.

Contributor

@CuiYifeng CuiYifeng May 21, 2025


Done. src/ATen/native/xpu/mkl/*.cpp is filtered in src/ATen/CMakeLists.txt when oneMKL XPU support is OFF.
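
The filtering described above could look roughly like this (an illustrative sketch only; the actual variable and list names in src/ATen/CMakeLists.txt may differ):

```cmake
# Illustrative only: exclude the oneMKL-backed sources when oneMKL XPU is OFF.
file(GLOB ATen_XPU_MKL_SRCS "${CMAKE_CURRENT_SOURCE_DIR}/native/xpu/mkl/*.cpp")
if(NOT USE_ONEMKL_XPU)
  list(REMOVE_ITEM ATen_XPU_SRCS ${ATen_XPU_MKL_SRCS})
endif()
```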

const Tensor& U,
const Tensor& S,
const Tensor& Vh,
const Tensor& info) {
Contributor


Can you align this op implementation with the svd_stub logic in stock PyTorch? The previous code in IPEX is based on a legacy implementation from PyTorch 1.x. For our current upstream solution, those copy and resize steps should already be covered by common code in stock PyTorch; please check.

Contributor


I tried to remove those resize and copy calls, but the current implementation breaks without them. My suggestion is to ensure functionality on XPU first and refine this code in another PR.

Contributor


@CuiYifeng, is there any justification for ensuring the functionality of these operations first? Do any models depend on these operations? If not, why don't we polish this PR until it is good enough?

Contributor


SVD was requested by Argonne. I get your point.

@CuiYifeng CuiYifeng requested a review from Copilot May 21, 2025 05:17
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This pull request adds support for Singular Value Decomposition (SVD) on XPU devices using oneMKL by introducing new op definitions, implementing corresponding MKL-based kernels, and updating test skip lists.

  • Updated YAML configuration to expose SVD ops (_linalg_svd, linalg_svd, and their U variants).
  • Added new SVD kernel implementations and supporting utility functions in the xpu/mkl directories.
  • Adjusted test skip lists to reflect the updated SVD support on XPU float64 inputs.

Reviewed Changes

Copilot reviewed 6 out of 9 changed files in this pull request and generated no comments.

Show a summary per file
File Description
yaml/native/native_functions.yaml Adds new op definitions for SVD and its variants on XPU devices.
test/xpu/skip_list_common.py Removes several float64-related SVD test skips to enable testing on XPU devices.
src/ATen/native/xpu/mkl/SpectralOps.cpp Removes conditional compilation guard for oneMKL usage.
src/ATen/native/xpu/mkl/BatchLinearAlgebra.h Introduces the declaration for the new svd_mkl function.
src/ATen/native/xpu/mkl/BatchLinearAlgebra.cpp Implements the MKL-based SVD functionality with type specializations.
src/ATen/native/xpu/BatchLinearAlgebra.cpp Provides an XPU dispatch wrapper that calls the new oneMKL-based implementation.
Files not reviewed (3)
  • cmake/ONEMKL.cmake: Language not supported
  • src/ATen/CMakeLists.txt: Language not supported
  • src/ATen/native/xpu/XPUFallback.template: Language not supported
Comments suppressed due to low confidence (1)

src/ATen/native/xpu/mkl/SpectralOps.cpp:1

  • The removal of the conditional compilation guard for USE_ONEMKL in this file may impact systems without oneMKL. Please verify that this change is intentional and that necessary build guards are applied elsewhere if needed.
#include <ATen/native/Resize.h>

@xytintel xytintel requested a review from CuiYifeng May 21, 2025 05:26
@CuiYifeng CuiYifeng requested a review from fengyuan14 May 21, 2025 05:36
@chuanqi129 chuanqi129 removed the xpu-op label May 21, 2025
@EikanWang
Contributor

Please address @majing921201's comments.
