Zeros along edges of regression test output #2589

Open
kristinbarton opened this issue Feb 4, 2025 · 9 comments · May be fixed by NOAA-EMC/fv3atm#929 or #2595

@kristinbarton

Description

I have been working on developing a regional Arctic domain based off existing HAFS regression tests in the UFS weather model. I noticed that when using cubed_sphere_grid output, the atmf00*.nc files contain anomalous zeros along the edges of the domain. Looking back through the regression tests, I found that these zeros are also present in the hafs_regional_atm_thompson_gfdlsf regression test.

To Reproduce:

The error can be found in the baseline dataset for the hafs_regional_atm_thompson_gfdlsf regression test. I have been working on Hera and found that the zeros were still present as recently as the 2025-01-29 baseline files (/scratch2/NAGAPE/epic/UFS-WM_RT/NEMSfv3gfs/develop-20250129/hafs_regional_atm_thompson_gfdlsf_intel/)

Output

The anomalous zeros are all along the edges of the domain. For example, below is a quick plot showing where the tmp variable is equal to 0 (purple) and not 0 (yellow). This is taken from the hafs_regional_atm_thompson_gfdlsf regression test data. The plot shows the time=0 and pfull=0 layer, but the distribution of 0s appears to be the same for every layer and for every variable in the dataset.

[Image: locations where tmp equals 0 (purple) and where it does not (yellow)]
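
The check described above (where a field is exactly 0, and whether those zeros sit on the domain edge) can be sketched as follows. This is a minimal, dependency-free illustration and not the script used for the plot: the function names and the small `tmp` array are made up, and a real check would read a 2D slice from one of the atmf00*.nc files instead.

```python
def zero_mask(field):
    """Return a 2D list of booleans, True where the value is exactly 0."""
    return [[value == 0 for value in row] for row in field]


def edge_zero_counts(field):
    """Count exact zeros on the outermost rows/columns versus the interior."""
    mask = zero_mask(field)
    nrows, ncols = len(mask), len(mask[0])
    edge = interior = 0
    for j in range(nrows):
        for i in range(ncols):
            if not mask[j][i]:
                continue
            if j in (0, nrows - 1) or i in (0, ncols - 1):
                edge += 1
            else:
                interior += 1
    return edge, interior


# Hypothetical 4x5 "tmp" slice (K): zeros on part of the boundary only.
tmp = [
    [0.0, 0.0, 287.1, 0.0, 0.0],
    [0.0, 288.4, 288.0, 287.6, 287.2],
    [289.0, 288.7, 288.3, 287.9, 0.0],
    [0.0, 0.0, 0.0, 288.1, 0.0],
]
edge_zeros, interior_zeros = edge_zero_counts(tmp)
print(edge_zeros, interior_zeros)
```

A nonzero edge count together with a zero interior count matches the pattern reported here. Repeating the check for each layer and variable would confirm whether the distribution is the same throughout.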

@kristinbarton added the bug label on Feb 4, 2025
@NickSzapiro-NOAA
Collaborator

Given the output type and test, is this related to #2591 @DusanJovic-NOAA ?

@DusanJovic-NOAA
Collaborator

> Given the output type and test, is this related to #2591 @DusanJovic-NOAA ?

PR 2591 only changes how data are written; it does not change any data. The regression test reproduces current outputs. I'll take a look at the hafs_regional_atm_thompson_gfdlsf regression test.

@DusanJovic-NOAA
Collaborator

The fact that not all points along the boundary rows/columns are zero indicates that the erroneous zero values come from esmf remapping, probably because of slight differences in the lat/lon values between the forecast component and the write component. Interestingly, of all the hafs tests, this one (hafs_regional_atm_thompson_gfdlsf) is the only one that writes regional native grid (i.e. cubed sphere) output. It is also compiled with -D32BIT=ON, which means the lat/lon values of the esmf grid on the forecast component are 32-bit values, while those of the esmf grid on the write grid component are internally 64-bit values (by default).
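
To make the size of such a 32-bit versus 64-bit difference concrete, here is a small sketch using only the Python standard library. The longitude value is hypothetical, and the snippet only shows that the same nominal coordinate is stored as two slightly different numbers. Whether that difference is what leaves boundary points unmapped is the hypothesis under test here, not something the snippet demonstrates.

```python
import struct


def to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float and widen it back."""
    return struct.unpack("f", struct.pack("f", x))[0]


lon64 = -75.123456789        # hypothetical boundary longitude, 64-bit
lon32 = to_float32(lon64)    # the same coordinate as a 32-bit build would hold it

print(lon64 == lon32)        # the two representations are not identical
print(abs(lon64 - lon32))    # difference in degrees
```

Near 75 degrees, adjacent 32-bit floats are about 7.6e-6 degrees apart, so the rounding error is at most about half of that. That is small, but it is enough for a point that should lie exactly on a grid edge to end up marginally on one side of it.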

Just to test this hypothesis, I compiled the model using 64-bit (without -D32BIT=ON cmake option) and ran the hafs_regional_atm_thompson_gfdlsf test, for example by using this rt.conf:

### HAFS tests ###
COMPILE | hafs | intel | -DAPP=HAFS -DCCPP_SUITES=FV3_HAFS_v1_thompson_tedmf_gfdlsf |  | fv3 |
RUN | hafs_regional_atm_thompson_gfdlsf                 |                              | baseline |

and I do not see zeros along the boundary.

@kristinbarton Can you please recompile your code without -D32BIT=ON and rerun your test?

If this works, then the proper solution will be either to truncate the esmf grid coordinates to single precision when 32BIT is ON, or maybe to use esmf redist instead of esmf remap.
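
As an illustration of the first option only (truncating coordinates to single precision), the sketch below stores the same nominal longitudes once as 32-bit and once as 64-bit values, then truncates the 64-bit copy. The values and variable names are hypothetical, and this is not the code in any fv3atm branch. It only shows that truncation makes the two sets of coordinates compare equal.

```python
from array import array

# Hypothetical nominal longitudes along one edge of a regional domain.
nominal = [-75.1, -74.9, -74.7, -74.3]

forecast_lon = array("f", nominal)  # stand-in for the 32-bit forecast grid
write_lon = array("d", nominal)     # stand-in for the 64-bit write grid


def count_mismatches(a, b):
    """Number of positions where the two coordinate sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


before = count_mismatches(forecast_lon, write_lon)

# Truncate the 64-bit coordinates to single precision, then widen them again.
truncated = array("f", list(write_lon))
write_lon_truncated = array("d", [float(v) for v in truncated])

after = count_mismatches(forecast_lon, write_lon_truncated)
print(before, after)
```

The second option, redistribution instead of remapping, avoids the coordinate comparison altogether and is not covered by this sketch.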

@kristinbarton
Author

Thank you for looking into this.

Our regional Arctic domain also used -D32BIT=ON, so I recompiled both our case and the regression test using 64-bit. Both cases successfully generated output with no zeros.

@gspetro-NOAA
Collaborator

@kristinbarton Are you planning to troubleshoot this issue and open a PR, or are you looking for one of our teams to assist with this?
@BinLiu-NOAA do you have any suggestions for Kristin?

@kristinbarton
Author

@gspetro-NOAA For our purposes, using the 64-bit compilation is satisfactory.

@DusanJovic-NOAA
Collaborator

I'm testing a code change that will fix this issue with -D32BIT=ON.

@DusanJovic-NOAA
Collaborator

@kristinbarton Please try to use the code from this fv3atm branch (https://github.com/DusanJovic-NOAA/fv3atm/tree/cubed_sphere_redist) and corresponding ufs-wm branch (https://github.com/DusanJovic-NOAA/ufs-weather-model/tree/cubed_sphere_redist) to run your regional Arctic domain configuration but with -D32BIT=ON, to see if the issue is fixed.

If it is, I will open a PR.

@kristinbarton
Author

@DusanJovic-NOAA It looks like this did fix the issue. I tested it on our Arctic configuration with 32-bit on and there were no zeros in the output.
