Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Poor default Tensile_COMPILER on Linux #1525

Open
trixirt opened this issue Dec 9, 2024 · 9 comments
Open

[Bug]: Poor default Tensile_COMPILER on Linux #1525

trixirt opened this issue Dec 9, 2024 · 9 comments
Assignees

Comments

@trixirt
Copy link
Contributor

trixirt commented Dec 9, 2024

This option sets the Tensile_COMPILER on Linux
https://github.com/ROCm/rocBLAS/blob/develop/cmake/build-options.cmake#L78

No checking is made if amdclang++ is in the PATH.
clang++ or hipcc or whatever the user passed in as the CMAKE_CXX_COMPILER would have been a better choice. On most Linux distros, there is no amdclang++.

Found on Fedora Rawhide while updating to 6.3

@TorreZuk
Copy link
Contributor

TorreZuk commented Dec 9, 2024

@trixirt for 6.2 it was hipcc without path verification. In both cases (hipcc and now amdclang++) an OS stock c++ compiler is not supported for hip device code. I agree we can consider compiler specification outside of Tensile_COMPILER variable, but detection should be done in the Tensile component. If the path pointed before to include hipcc it should also find amdclang++. Please provide your error messages as will discuss with Tensile how to work around your build issues. Provide your PATH variable and other relevant information.

@ellosel
Copy link
Contributor

ellosel commented Dec 9, 2024

We are moving away from hipcc. Once it is removed from tensile, the only compiler that we will support on linux is amdclang++. Perhaps I am missing a use case here so please provide more detail.

If ROCM_PATH is not /opt/rocm you are free to change that to wherever rocm is installed and Tensile will honor it.

@trixirt
Copy link
Contributor Author

trixirt commented Dec 9, 2024

On Fedora, the complier used for rocm is not called amdclang++, it is either hipcc or clang++. hipcc is in the normal PATH, so we were getting lucky in 6.2. In either case on 6.2 or 6.3, there is no checking that the default compiler is in the PATH. I am working around this by setting -DTensile_COMPILER= at configure time. I would expect the Tensile_COMPILER to be checked in the same manner as the CMAKE_CXX_COMPILER, failing a configure time, not during the build.

@ellosel
Copy link
Contributor

ellosel commented Dec 9, 2024

@trixirt can you provide a minimal docker file to reproduce your environment and the command you are running to build rocBLAS.

@TorreZuk
Copy link
Contributor

TorreZuk commented Dec 9, 2024

Well can Fedora add the amdclang++ as an symlink to their ROCm llvm toolchain to match supported toolchain status. If we want to pass along our rocblas CMAKE_CXX_COMPILER it would still need amdclang++ . We could consider even removing that tensile variable from the build command and leave Tensile to determine it's own via cmake vars and env overrides. It was required to be when we switched to amdclang++ in rocblas but Tensile was still stuck using deprecated hipcc. Is all hipcc removed now in Tensile? I thought that still had pending changes. Fedora likely needs to add the workaround as 6.3 is released, for 6.4 maybe we can change.

@trixirt
Copy link
Contributor Author

trixirt commented Dec 10, 2024

A better default for the Tensile_COMPILER would be the CMAKE_CXX_COMPILER.

hipcc is still shipped with 6.3, I would expect folks to use it until it is removed completely. When will this happen ?

@TorreZuk
Copy link
Contributor

rocBLAS for 6.3 should be built by default with amdclang++ as set in the toolchain-linux.cmake file. Using a different compiler with potentially different defaults is not advised.
hipcc was deprecated in 5.4.3 https://rocm.docs.amd.com/en/docs-5.4.3/release.html#rocm-5-4-3
I don't know the removal date, but while it can be used for compiling code until removed it is not recommended to be used for 6.3+ builds of rocBLAS. We can look at removing the Tensile_COMPILER variable or using CMAKE_CXX_COMPILER but please try and use amdclang tools for the 6.3 codebase to match other OS.

@IMbackK
Copy link

IMbackK commented Dec 19, 2024

Hmm only pearl hipcc was communicated as deprecated, together with all the other pearl tools like the old hipify implementation. As far as i am aware the deprecation of the c++ implementation at https://github.com/ROCm/llvm-project/tree/amd-staging/amd/hipcc has not been communicated publicly. The readme of hipcc also only mentions replaceing the deprecated pearl implementation.

If the new c++ implementation of hipcc is also intended to be deprecated this is something that should be better communicated.

@TorreZuk
Copy link
Contributor

Will see if any compiler people chime in here, so maybe only the perl version was officially deprecated. It is no longer the default for building rocBLAS with ROCm 6.3 and later. As the behaviours (e.g. default flags) are not guaranteed to match you may diverge from what is in our tested build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants