Fix Kineto+PTI profiling on BMG #4244

Merged
merged 19 commits into from
May 21, 2025

Conversation

anmyachev
Contributor

@anmyachev anmyachev commented May 19, 2025

anmyachev added 6 commits May 19, 2025 20:50
Signed-off-by: Anatoly Myachev <[email protected]>
Signed-off-by: Anatoly Myachev <[email protected]>
Signed-off-by: Anatoly Myachev <[email protected]>
Signed-off-by: Anatoly Myachev <[email protected]>
@anmyachev
Contributor Author

FYI @etiotto @whitneywhtsang: the GEMM benchmark with tensor of pointers doesn't work correctly on BMG: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/15122978565/job/42509360103

@anmyachev anmyachev changed the title from "[DEBUG] Try intel-pti 0.12.2 for benchmarks" to "Fix Kineto+PTI profiling on BMG" May 20, 2025
@anmyachev anmyachev linked an issue May 20, 2025 that may be closed by this pull request
Signed-off-by: Anatoly Myachev <[email protected]>
@@ -141,7 +142,10 @@ jobs:
python build_report.py $REPORTS/matmul-performance.csv $REPORTS/gemm-triton-report.csv --benchmark gemm-legacy --compiler triton --param_cols "B,M,K,N" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG
python build_report.py $REPORTS/matmul-performance.csv $REPORTS/gemm-xetla-report.csv --benchmark gemm-legacy --compiler xetla --param_cols "B,M,K,N" --tflops_col XeTLA-TFlops --hbm_col "XeTLA-GB/s" --tag $TAG
python build_report.py $REPORTS/matmul-performance.csv $REPORTS/gemm-onednn-report.csv --benchmark gemm-legacy --compiler onednn --param_cols "B,M,K,N" --tflops_col OneDNN-TFlops --hbm_col "OneDNN-GB/s" --tag $TAG
python build_report.py $REPORTS/matmul-performance.csv $REPORTS/gemm-cutlass-report.csv --benchmark gemm-legacy --compiler cutlass --param_cols "B,M,K,N" --tflops_col CUTLASS-TFlops --hbm_col "CUTLASS-GB/s" --tag $TAG
if [[ "${{ inputs.runner_label }}" = "max1550" ]]; then
Contributor Author

Contributor

Tracked through #4254.

Contributor Author

Thanks @sommerlukas!

anmyachev added 3 commits May 20, 2025 21:19
This reverts commit 2a6ca23.
Signed-off-by: Anatoly Myachev <[email protected]>
@anmyachev anmyachev marked this pull request as ready for review May 21, 2025 10:32
@anmyachev anmyachev force-pushed the amyachev/issue4172 branch from 0356811 to e03e521 Compare May 21, 2025 13:25
@sommerlukas sommerlukas removed their request for review May 21, 2025 13:31
@sommerlukas sommerlukas requested a review from jle-quel May 21, 2025 13:31
Signed-off-by: Anatoly Myachev <[email protected]>
@@ -141,7 +148,10 @@ jobs:
source ../../scripts/capture-hw-details.sh
python build_report.py $REPORTS/matmul-performance-base.csv $REPORTS/gemm-newshapes-triton-report.csv --benchmark gemm --compiler triton --param_cols "B,M,K,N" --tflops_col Triton-TFlops --hbm_col "Triton-GB/s" --tag $TAG
python build_report.py $REPORTS/matmul-performance-base.csv $REPORTS/gemm-newshapes-onednn-report.csv --benchmark gemm --compiler onednn --param_cols "B,M,K,N" --tflops_col OneDNN-TFlops --hbm_col "OneDNN-GB/s" --tag $TAG
python build_report.py $REPORTS/matmul-performance-base.csv $REPORTS/gemm-newshapes-cutlass-report.csv --benchmark gemm --compiler cutlass --param_cols "B,M,K,N" --tflops_col CUTLASS-TFlops --hbm_col "CUTLASS-GB/s" --tag $TAG
if [[ "${{ inputs.runner_label }}" = "max1550" ]]; then
Contributor

Note that inputs.runner_label is not set by default, so most likely the condition is not met on max1550 (please double-check the last run). You may need something like this: ${{ inputs.runner_label || 'max1550' }}.
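The fallback suggested above can be sketched in plain bash. This is only an illustration of the default-value logic, not the workflow's actual code: the variable `runner_label` is a hypothetical stand-in for whatever GitHub Actions substitutes for `${{ inputs.runner_label }}`, and an empty string models an input that was not provided.

```shell
# Hypothetical stand-in for the value substituted for ${{ inputs.runner_label }};
# an empty string models the case where the input was not set.
runner_label=""

# Fall back to max1550 when the value is empty or unset.
effective_label="${runner_label:-max1550}"

if [[ "$effective_label" = "max1550" ]]; then
  echo "running max1550-only report steps"
fi
```

Without the fallback, the comparison would be between an empty string and `max1550`, so the guarded report steps would be skipped silently rather than fail.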

@@ -74,7 +76,7 @@ jobs:
     timeout-minutes: 720
     defaults:
       run:
-        shell: bash -noprofile --norc -eo pipefail -c "source /opt/intel/oneapi/setvars.sh > /dev/null; source {0}"
+        shell: bash -noprofile --norc -eo pipefail -c "source /opt/intel/oneapi/setvars.sh > /dev/null; export LD_LIBRARY_PATH=$PTI_LIBS_DIR:$LD_LIBRARY_PATH; source {0}"
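As a side note on the prepend pattern in this hunk, here is a minimal sketch of how it behaves when `LD_LIBRARY_PATH` starts out empty. The PTI directory below is a hypothetical placeholder, not the path used in CI. In the real workflow `setvars.sh` runs first and presumably populates the variable, so the edge case may never occur there.

```shell
# Hypothetical PTI library directory, used only for this illustration.
PTI_LIBS_DIR=/tmp/hypothetical/pti/lib

# Start from an unset LD_LIBRARY_PATH to show the edge case.
unset LD_LIBRARY_PATH

# Plain prepend, as in the diff: leaves a trailing ':' if the variable was empty.
plain="$PTI_LIBS_DIR:$LD_LIBRARY_PATH"

# Guarded prepend: the ':' separator is added only if a previous value exists.
guarded="$PTI_LIBS_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "plain:   $plain"
echo "guarded: $guarded"
```

The dynamic linker treats a zero-length entry in `LD_LIBRARY_PATH` as the current working directory, which is why the guarded form is sometimes preferred.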
Contributor Author

@pbchekin do you have a suggestion for how to fix it?

Contributor Author

I decided to restore it as it was, with the duplication, but at least it works.

Contributor

@pbchekin pbchekin May 21, 2025

Just an idea: we can add a new step at the very beginning (after installing python and intel-pti) that does not use the default shell (so this code is not executed). In this step, create a file, for example ~/.env, with

PTI_LIBS_DIR=...
source /opt/intel/oneapi/setvars.sh > /dev/null
export LD_LIBRARY_PATH=$PTI_LIBS_DIR:$LD_LIBRARY_PATH;

Then the default shell can be changed to

shell: bash -noprofile --norc -eo pipefail -c "[[ -f ~/.env ]] && source ~/.env; source {0}"
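The idea can be tried locally with a rough sketch like the one below. It makes several assumptions: a temporary file stands in for `~/.env`, the PTI directory is a hypothetical placeholder, and the `setvars.sh` line is left as a comment so the snippet runs on a machine without oneAPI installed.

```shell
# Sketch only: a temporary file stands in for ~/.env, and the PTI path is a
# hypothetical placeholder rather than the directory used in the real workflow.
env_file="$(mktemp)"

cat > "$env_file" <<'EOF'
PTI_LIBS_DIR=/tmp/hypothetical/pti/lib
# The real file would also source /opt/intel/oneapi/setvars.sh here.
export LD_LIBRARY_PATH=$PTI_LIBS_DIR:$LD_LIBRARY_PATH
EOF

# What the proposed default shell would do before running each step's script:
# source the environment file only if it exists.
[[ -f "$env_file" ]] && source "$env_file"

echo "$LD_LIBRARY_PATH"
```

Because the existence test is not the last command in the `&&` list, a missing file does not trip `-e`; steps that run before the file is created would simply skip the sourcing.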

Contributor

If adjusting LD_LIBRARY_PATH is safer in each step, then my suggestion is to create a file and source it in each step

Contributor

> If adjusting LD_LIBRARY_PATH is safer in each step, then my suggestion is to create a file and source it in each step

NVM, I see that in the last commit you have only one additional line per step; this looks good.

Contributor Author

BTW, thanks for the idea!

@anmyachev anmyachev merged commit 3f3bcf3 into main May 21, 2025
16 of 17 checks passed
@anmyachev anmyachev deleted the amyachev/issue4172 branch May 21, 2025 16:49
Development

Successfully merging this pull request may close these issues.

[benchmarks][BMG] The profiling numbers don't match
4 participants