Conversation
…bs because of leaking imports
3cfa25f to
d60ba3d
Compare
|
Hi @dareg, @pmarguinaud, Have you had the chance to take a look at this PR yet? A big PR like this will conflict with many other bugfixes/developments, so it would be better to merge this sooner rather than later. |
|
I see no problem in validating this PR, I've tested it on our systems (so without AMD compiler neither GPU) and it doesn’t seems to break anything. But I just got access (today) to a system with AMD card and I'm trying to compile on it. |
I'll look into that error and find a proper fix for it. for now you can just downgrade to ecbuild 3.9.0 which doesn't have that error message. |
In fact this problem is fixed again in ecbuild 3.12, so it's better to use that. Can I push an update here to set the minimum required ecbuild version to 3.12? |
|
Yes sure, better to set it to the right version of ecbuild |
|
The tests are failing because of a fiat bug, which should be fixed once ecmwf-ifs/fiat#96 is merged. |
mlange05
left a comment
There was a problem hiding this comment.
Excellent work, looks very good to me. GTG from my side. 👍
|
@dareg were you able to test this successfully with an AMD GPU? |
| DESCRIPTION "Enable GPU offload via OpenMP" | ||
| CONDITION CMAKE_Fortran_COMPILER_ID MATCHES "PGI|NVHPC" AND ${_HAVE_OMP_OFFLOAD} ) | ||
| CONDITION | ||
| (CMAKE_Fortran_COMPILER_ID MATCHES "PGI|NVHPC" OR CMAKE_Fortran_COMPILER MATCHES "amdflang") |
There was a problem hiding this comment.
I think you are missing the _ID at the end of CMAKE_Fortran_COMPILER
There was a problem hiding this comment.
Ah no, I misunderstood. CMAKE_Fortran_COMPILER is amdflang, and the CMAKE_Fortran_COMPILER_ID is LLVMFlang
| else() | ||
| set(FIELD_API_OFFLOAD_MODEL "NVHPCOpenMP") | ||
| endif() | ||
| elseif( CMAKE_Fortran_COMPILER MATCHES "amdflang") |
There was a problem hiding this comment.
Same here, I think you are missing the _ID at the end of CMAKE_Fortran_COMPILER
This PR contributes an OpenMP target offload backend for AMD's ROCm-afar compiler.
All tests pass, other than
tests/sync_device.F90. We hit a runtime problem when trying to accessFIELD%DEVPTRon device, which seems linked to trying to access an abstract type on device. This is not an access pattern we use in our transformations, so for now I have just marked this as expected to fail for offloaded rocm-afar builds, and added thetests/copy_struct.F90test which is more representative of the code we generate with loki.OWNER_GET_DEVICE_DATAhad a similar problem during the on-device initialisation, and there I've used an associate block to work around it.The big thing that is missing here is multi-precision builds with flang. I've understood the problem, and it's to do with flang being much stricter about module imports than other compilers. To regain that ability with flang, we have to make the "core" of the build precision independent again. This means removing
INIT_DEBUG_VALUE_JPRBfrom field_defaults_module.F90. I understand this requires a change in IAL, so for now I've setup the CI to only build SP for flang. I will create an issue to track this and assign it to you @dareg if that's ok. It would be great to also support multi-precision builds in flang.In a future PR I will also contribute an hpc-ci entry for running field_api on LUMI so that we can actually test it on AMD GPUs too.