Skip to content

ROCm-afar OpenMP target offload#137

Open
awnawab wants to merge 21 commits intoecmwf-ifs:mainfrom
awnawab:naan-openmp-offload-amd-rebase
Open

ROCm-afar OpenMP target offload#137
awnawab wants to merge 21 commits intoecmwf-ifs:mainfrom
awnawab:naan-openmp-offload-amd-rebase

Conversation

@awnawab
Copy link
Collaborator

@awnawab awnawab commented Feb 6, 2026

This PR contributes an OpenMP target offload backend for AMD's ROCm-afar compiler.

All tests pass, other than tests/sync_device.F90. We hit a runtime problem when trying to access FIELD%DEVPTR on device, which seems linked to trying to access an abstract type on device. This is not an access pattern we use in our transformations, so for now I have just marked this as expected to fail for offloaded rocm-afar builds, and added the tests/copy_struct.F90 test which is more representative of the code we generate with loki. OWNER_GET_DEVICE_DATA had a similar problem during the on-device initialisation, and there I've used an associate block to work around it.

The big thing that is missing here is multi-precision builds with flang. I've understood the problem, and it's to do with flang being much stricter about module imports than other compilers. To regain that ability with flang, we have to make the "core" of the build precision independent again. This means removing INIT_DEBUG_VALUE_JPRB from field_defaults_module.F90. I understand this requires a change in IAL, so for now I've setup the CI to only build SP for flang. I will create an issue to track this and assign it to you @dareg if that's ok. It would be great to also support multi-precision builds in flang.

In a future PR I will also contribute an hpc-ci entry for running field_api on LUMI so that we can actually test it on AMD GPUs too.

@awnawab awnawab added the approved-for-ci Approved to run hpc-ci label Feb 6, 2026
@awnawab awnawab force-pushed the naan-openmp-offload-amd-rebase branch from 3cfa25f to d60ba3d Compare February 6, 2026 17:35
@github-actions github-actions bot removed the approved-for-ci Approved to run hpc-ci label Feb 6, 2026
@awnawab awnawab added the approved-for-ci Approved to run hpc-ci label Feb 6, 2026
@awnawab
Copy link
Collaborator Author

awnawab commented Feb 17, 2026

Hi @dareg, @pmarguinaud,

Have you had the chance to take a look at this PR yet? A big PR like this will conflict with many other bugfixes/developments, so it would be better to merge this sooner rather than later.

@dareg
Copy link
Collaborator

dareg commented Feb 23, 2026

I see no problem in validating this PR, I've tested it on our systems (so without AMD compiler neither GPU) and it doesn’t seems to break anything.

But I just got access (today) to a system with AMD card and I'm trying to compile on it.
I get a problem in ecbuild because it doesn't recognise the compiler and then cannot load the right option (nothing complicated, see below). So I was wondering which compiler did you use to try this PR on?

CMake Error at build/ecbuild/cmake/ecbuild_log.cmake:190 (message):
  CRITICAL - Variable 'ECBUILD_Fortran_COMPILE_OPTIONS_REAL4' must be
  defined for compiler with ID LLVMFlang.

   Description:
     Compile options to convert all unqualified reals to 32 bit (single precision)
   Please submit a patch. In the mean time you can provide the variable to the CMake configuration.

@awnawab
Copy link
Collaborator Author

awnawab commented Feb 23, 2026

I see no problem in validating this PR, I've tested it on our systems (so without AMD compiler neither GPU) and it doesn’t seems to break anything.

But I just got access (today) to a system with AMD card and I'm trying to compile on it. I get a problem in ecbuild because it doesn't recognise the compiler and then cannot load the right option (nothing complicated, see below). So I was wondering which compiler did you use to try this PR on?

CMake Error at build/ecbuild/cmake/ecbuild_log.cmake:190 (message):
  CRITICAL - Variable 'ECBUILD_Fortran_COMPILE_OPTIONS_REAL4' must be
  defined for compiler with ID LLVMFlang.

   Description:
     Compile options to convert all unqualified reals to 32 bit (single precision)
   Please submit a patch. In the mean time you can provide the variable to the CMake configuration.

I'll look into that error and find a proper fix for it. for now you can just downgrade to ecbuild 3.9.0 which doesn't have that error message.

@awnawab
Copy link
Collaborator Author

awnawab commented Feb 23, 2026

I see no problem in validating this PR, I've tested it on our systems (so without AMD compiler neither GPU) and it doesn’t seems to break anything.

But I just got access (today) to a system with AMD card and I'm trying to compile on it. I get a problem in ecbuild because it doesn't recognise the compiler and then cannot load the right option (nothing complicated, see below). So I was wondering which compiler did you use to try this PR on?

CMake Error at build/ecbuild/cmake/ecbuild_log.cmake:190 (message):
  CRITICAL - Variable 'ECBUILD_Fortran_COMPILE_OPTIONS_REAL4' must be
  defined for compiler with ID LLVMFlang.

   Description:
     Compile options to convert all unqualified reals to 32 bit (single precision)
   Please submit a patch. In the mean time you can provide the variable to the CMake configuration.

In fact this problem is fixed again in ecbuild 3.12, so it's better to use that. Can I push an update here to set the minimum required ecbuild version to 3.12?

@dareg
Copy link
Collaborator

dareg commented Feb 24, 2026

Yes sure, better to set it to the right version of ecbuild

@github-actions github-actions bot removed the approved-for-ci Approved to run hpc-ci label Feb 24, 2026
@awnawab awnawab added the approved-for-ci Approved to run hpc-ci label Feb 24, 2026
@awnawab
Copy link
Collaborator Author

awnawab commented Feb 24, 2026

The tests are failing because of a fiat bug, which should be fixed once ecmwf-ifs/fiat#96 is merged.

Copy link
Collaborator

@mlange05 mlange05 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work, looks very good to me. GTG from my side. 👍

@awnawab
Copy link
Collaborator Author

awnawab commented Feb 24, 2026

@dareg were you able to test this successfully with an AMD GPU?

DESCRIPTION "Enable GPU offload via OpenMP"
CONDITION CMAKE_Fortran_COMPILER_ID MATCHES "PGI|NVHPC" AND ${_HAVE_OMP_OFFLOAD} )
CONDITION
(CMAKE_Fortran_COMPILER_ID MATCHES "PGI|NVHPC" OR CMAKE_Fortran_COMPILER MATCHES "amdflang")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are missing the _ID at the end of CMAKE_Fortran_COMPILER

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah no, I misunderstood. CMAKE_Fortran_COMPILER is amdflang, and the CMAKE_Fortran_COMPILER_ID is LLVMFlang

else()
set(FIELD_API_OFFLOAD_MODEL "NVHPCOpenMP")
endif()
elseif( CMAKE_Fortran_COMPILER MATCHES "amdflang")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, I think you are missing the _ID at the end of CMAKE_Fortran_COMPILER

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved-for-ci Approved to run hpc-ci contributor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants