Rocm jaxlib v0.5.0 slowdown 3 #127

zoranjovanovic-ns · 2025-03-11T16:14:09Z

Just for testing purposes

This reverts commit 25a5db7.

…th rocm

Imported from GitHub PR openxla#22334

This change fixes the flaky gpu compiler test used to run on rocm CI pipeline gate. Triton pipeline was wrongly using the TritonGPUAccelerateMatmul pass which supports cuda only. In rocm there is a different pass which is now used in the rocm pipeline.
https://github.com/triton-lang/triton/blob/main/third_party/amd/lib/TritonAMDGPUTransforms/AccelerateAMDMatmul.cpp

Copybara import of the project:

--
c5f600f by Alexandros Theodoridis:

Fix flaky gpu compiler test when building with rocm

Merging this change closes openxla#22334

COPYBARA_INTEGRATE_REVIEW=openxla#22334 from ROCm:rocm_fix_flaky_gpu_compiler_test c5f600f
PiperOrigin-RevId: 723469960
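For readers unfamiliar with the fix described in the imported commit message, the idea can be sketched as a backend-dependent choice of the matmul acceleration pass. This is only an illustrative Python sketch, not the XLA implementation (which is C++): the `Backend` enum, `matmul_acceleration_pass`, `build_pipeline`, and the placeholder pass labels are all hypothetical, and the ROCm pass label is simply taken from the file name in the linked Triton source.

```python
from enum import Enum


class Backend(Enum):
    """Hypothetical backend tag; not an XLA or Triton type."""
    CUDA = "cuda"
    ROCM = "rocm"


# Labels only. "TritonGPUAccelerateMatmul" is the pass named in the commit
# message; "AccelerateAMDMatmul" is taken from the linked file name and is
# an assumption about how the ROCm pass would be labelled here.
_MATMUL_PASS = {
    Backend.CUDA: "TritonGPUAccelerateMatmul",
    Backend.ROCM: "AccelerateAMDMatmul",
}


def matmul_acceleration_pass(backend: Backend) -> str:
    """Return the matmul acceleration pass label for the given backend."""
    return _MATMUL_PASS[backend]


def build_pipeline(backend: Backend) -> list[str]:
    """Assemble a toy pass list; the surrounding entries are placeholders."""
    return [
        "<passes before matmul acceleration>",
        matmul_acceleration_pass(backend),
        "<passes after matmul acceleration>",
    ]
```

With this shape, a ROCm build never schedules the CUDA-only pass, which is the behaviour the commit message describes.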

draganmladjenovic and others added 4 commits March 11, 2025 08:59

[ROCm] Apply precise block size metadata

204392a

Fixes

bd68d4e

[ROCm] Pass correct warp size to Triton pipeline

c735ba6

Revert "added xla_enable_layout_assignment flag"

f9f68c2

This reverts commit 25a5db7.

zoranjovanovic-ns self-assigned this Mar 11, 2025
