forked from triton-lang/triton
Open
Description
Problem Description
When trying to run tune_gemm on a gfx950 using ROCm 7.0, I get the following error:
# ./tune_gemm.py --gemm_size_file t.yaml
Tuning 1 gemm sizes starts at: 2025-10-06 20:43:27.298307
SIZE: 8192 8192 8192 NN nConfigs: 1152 multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.12/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
^^^^^^^^^^^^^^^^^^^
File "/dockerx/triton/python/perf-kernels/tools/tune_gemm/./tune_gemm.py", line 184, in extract_kernel_time
first_value = df['DurationNs'].iloc[0]
~~~~~~~~~~~~~~~~~~~~~^^^
File "/opt/venv/lib/python3.12/site-packages/pandas/core/indexing.py", line 1191, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.12/site-packages/pandas/core/indexing.py", line 1752, in _getitem_axis
self._validate_integer(key, axis)
File "/opt/venv/lib/python3.12/site-packages/pandas/core/indexing.py", line 1685, in _validate_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/dockerx/triton/python/perf-kernels/tools/tune_gemm/./tune_gemm.py", line 725, in <module>
sys.exit(main())
^^^^^^
File "/dockerx/triton/python/perf-kernels/tools/tune_gemm/./tune_gemm.py", line 655, in main
minTime, bestConfig, compile_time, profile_time, post_time = tune_gemm_config(
^^^^^^^^^^^^^^^^^
File "/dockerx/triton/python/perf-kernels/tools/tune_gemm/./tune_gemm.py", line 264, in tune_gemm_config
config, myTime = task.get()
^^^^^^^^^^
File "/usr/lib/python3.12/multiprocessing/pool.py", line 774, in get
raise self._value
IndexError: single positional indexer is out-of-bounds
It does run correctly with the following workaround applied:
diff --git a/python/perf-kernels/tools/tune_gemm/tune_gemm.py b/python/perf-kernels/tools/tune_gemm/tune_gemm.py
index 7d51575e4..4ae63f04e 100755
--- a/python/perf-kernels/tools/tune_gemm/tune_gemm.py
+++ b/python/perf-kernels/tools/tune_gemm/tune_gemm.py
@@ -64,7 +64,7 @@ def get_full_tuning_space():
num_stage_range = [2]
waves_per_eu_range = [0]
matrix_instr_nonkdim_range = [16, 32]
- kpack_range = [1, 2]
+ kpack_range = [1]
schedule_hints = ["none"]
space = itertools.product(block_mn_range, block_mn_range, block_k_range, num_warps_range, group_m_range,
Operating System
Ubuntu 24.04.3 LTS
CPU
AMD Ryzen Threadripper PRO 7985WX 64-Cores
GPU
AMD Instinct MI350X
ROCm Version
7.0
ROCm Component
No response
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response
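For reference, the IndexError in the worker traceback comes from `df['DurationNs'].iloc[0]`, which raises when the DataFrame has no rows. Below is a minimal sketch of a guard around that lookup. It assumes that an empty profile should mark the config as failed instead of crashing the whole tuning run; `first_duration_ns` is a hypothetical helper, not a function in tune_gemm.py, and using `inf` as the sentinel is also an assumption. It does not explain why the `kpack = 2` configs produce no rows.

```python
import math

import pandas as pd


def first_duration_ns(df: pd.DataFrame, column: str = "DurationNs") -> float:
    """Return the first value of `column`, or inf if there is nothing to read.

    Hypothetical helper: inf is used so that a config with no profiler
    rows can never be selected as the fastest one.
    """
    # Guard against both a missing column and a DataFrame with zero rows,
    # which is the case that makes .iloc[0] raise IndexError.
    if column not in df.columns or df.empty:
        return math.inf
    return float(df[column].iloc[0])


# Usage: an empty profile yields the sentinel instead of an exception.
empty_profile = pd.DataFrame({"DurationNs": []})
valid_profile = pd.DataFrame({"DurationNs": [1200, 1300]})
print(first_duration_ns(empty_profile))
print(first_duration_ns(valid_profile))
```

With a guard like this the tuner could skip the failing configs and report them, which might make it easier to see which parameter combinations return an empty profile.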