Build fails with LTO #37

Open
eli-schwartz opened this issue Mar 31, 2024 · 12 comments
@eli-schwartz

eli-schwartz commented Mar 31, 2024

I tried to build with the following *FLAGS to optimize the build: -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing

Note the -Werror=* flags are used to help detect cases where the compiler tries to optimize by assuming UB cannot exist in the source code -- if it does exist, ordinarily the code would be miscompiled, and this says to make the miscompilation a fatal error.

I got this error:

/bin/sh ../libtool  --tag=CC   --mode=link x86_64-pc-linux-gnu-cc  -march=native -fstack-protector-all -O2 -pipe -fdiagnostics-color=always -frecord-gcc-switches -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing  -Wformat -Werror=format-security -Werror=implicit-function-declaration -Werror=implicit-int -Werror=int-conversion -Werror=incompatible-pointer-types -fopenmp  -Wl,-O1 -Wl,--as-needed -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing -Wl,--defsym=__gentoo_check_ldflags__=0 -fopenmp -o liblis.la -rpath /usr/lib64  array/libarray.la esolver/libesolver.la matrix/libmatrix.la matvec/libmatvec.la precision/libprecision.la precon/libprecon.la solver/libsolver.la system/libsystem.la vector/libvector.la  fortran/libfortran.la -L/usr/lib/gcc/x86_64-pc-linux-gnu/13 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../.. -lgfortran -lm -lquadmath -lm 
libtool: link: x86_64-pc-linux-gnu-cc -shared  -fPIC -DPIC  -Wl,--whole-archive array/.libs/libarray.a esolver/.libs/libesolver.a matrix/.libs/libmatrix.a matvec/.libs/libmatvec.a precision/.libs/libprecision.a precon/.libs/libprecon.a solver/.libs/libsolver.a system/.libs/libsystem.a vector/.libs/libvector.a fortran/.libs/libfortran.a -Wl,--no-whole-archive  -Wl,--as-needed -L/usr/lib/gcc/x86_64-pc-linux-gnu/13 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../.. -lgfortran /usr/lib/gcc/x86_64-pc-linux-gnu/13/libquadmath.so -lm  -march=native -fstack-protector-all -O2 -fdiagnostics-color=always -frecord-gcc-switches -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=format-security -Werror=implicit-function-declaration -Werror=implicit-int -Werror=int-conversion -Werror=incompatible-pointer-types -fopenmp -Wl,-O1 -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing -Wl,--defsym=__gentoo_check_ldflags__=0 -fopenmp   -fopenmp -Wl,-soname -Wl,liblis.so.0 -o .libs/liblis.so.0.0.0
fortran/lisf_init.F:43:72: error: type of 'lis_set_argv_f' does not match original declaration [-Werror=lto-type-mismatch]
   43 |         call lis_set_argv(i,argv,ierr)
      |                                                                        ^
fortran/lisf_system.c:158:6: note: type mismatch in parameter 4
  158 | void lis_set_argv_f(LIS_INT *no, char *argv, LIS_INT *ierr, LIS_INT len)
      |      ^
fortran/lisf_system.c:158:6: note: type 'LIS_INT' should match type 'long int'
fortran/lisf_system.c:158:6: note: 'lis_set_argv_f_' was previously declared here
fortran/lisf_system.c:158:6: note: code may be misoptimized unless '-fno-strict-aliasing' is used
lto1: some warnings being treated as errors
lto-wrapper: fatal error: x86_64-pc-linux-gnu-cc returned 1 exit status
compilation terminated.
/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:484: liblis.la] Error 1

Downstream report: https://bugs.gentoo.org/927587
Full build log: build.log

@anishida
Owner

anishida commented Apr 1, 2024

Thank you for your comment. However, I could not reproduce it in my environment (gcc-4.8.4). It would be helpful if you could provide more information about your environment.

@eli-schwartz
Author

eli-schwartz commented Apr 1, 2024

On Gentoo we have GCC 13 (gcc and gfortran frontends). Likely you'll need a recentish GCC as the diagnostic passes in GCC have gotten a lot more powerful over the years.

@anishida
Owner

anishida commented Apr 1, 2024

Thank you for your reply. Could you explain the purpose of the flags you specified?

@anishida
Owner

anishida commented Apr 1, 2024

What happens if you add '-fno-strict-aliasing', as the error message suggests?

@elsandosgrande

elsandosgrande commented Apr 1, 2024

@anishida To quote a Gentoo Bugzilla ticket which provides a brief explanation of the flags above:

Here is a bit of explanation:

-Werror=lto-type-mismatch:
Used to find possible runtime issues in packages. It likely means the package is unsafe to build & use with LTO.
For projects using the same identifier but with different types across different files, they must be fixed to be consistent across the codebase.

-Werror=odr:
Used to find possible runtime issues in packages. These bugs are a problem anyway but may be even worse when combined with LTO. C++ code must comply with the One Definition Rule (ODR) - see https://en.cppreference.com/w/cpp/language/definition#One_Definition_Rule.

-Werror=strict-aliasing:
Used to find possible runtime issues in packages. These bugs are a problem anyway but may be even worse when combined with LTO.

[…]

See also: https://marc.info/?l=gentoo-dev&m=165639574126280&w=2

@eli-schwartz
Author

> What happens if you add '-fno-strict-aliasing', as the error message suggests?

It doesn't help. The compiler diagnostic suggests it automatically, because it tells the compiler to pessimize the code generation (produce worse code, but refrain from trying to optimize by assuming the code is correct) and lto type mismatches can coincide with aliasing issues. But then the compiler still emits an lto type mismatch.

/bin/sh ../../libtool  --tag=CC   --mode=compile x86_64-pc-linux-gnu-cc -DHAVE_CONFIG_H -I. -I../../include  -I../../include   -march=native -fstack-protector-all -O2 -pipe -fdiagnostics-color=always -frecord-gcc-switches -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing  -Wformat -Werror=format-security -Werror=implicit-function-declaration -Werror=implicit-int -Werror=int-conversion -Werror=incompatible-pointer-types -fno-strict-aliasing -fopenmp -c -o lisf_system.lo lisf_system.c

[...]

libtool: compile:  x86_64-pc-linux-gnu-gfortran -DHAVE_CONFIG_H -I. -I../../include -I../../include -march=native -fstack-protector-all -O2 -pipe -fdiagnostics-color=always -frecord-gcc-switches -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing -fno-strict-aliasing -fopenmp -c lisf_init.F  -fPIC -o .libs/lisf_init.o

[...]

libtool: link: x86_64-pc-linux-gnu-cc -shared  -fPIC -DPIC  -Wl,--whole-archive array/.libs/libarray.a esolver/.libs/libesolver.a matrix/.libs/libmatrix.a matvec/.libs/libmatvec.a precision/.libs/libprecision.a precon/.libs/libprecon.a solver/.libs/libsolver.a system/.libs/libsystem.a vector/.libs/libvector.a fortran/.libs/libfortran.a -Wl,--no-whole-archive  -Wl,-rpath -Wl,//usr/lib/gcc/x86_64-pc-linux-gnu/13 -Wl,--as-needed -L/usr/lib/gcc/x86_64-pc-linux-gnu/13 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../.. -lgfortran //usr/lib/gcc/x86_64-pc-linux-gnu/13/libquadmath.so -lm  -march=native -fstack-protector-all -O2 -fdiagnostics-color=always -frecord-gcc-switches -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing -Werror=format-security -Werror=implicit-function-declaration -Werror=implicit-int -Werror=int-conversion -Werror=incompatible-pointer-types -fopenmp -Wl,-O1 -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing -Wl,--defsym=__gentoo_check_ldflags__=0 -fopenmp   -fopenmp -Wl,-soname -Wl,liblis.so.0 -o .libs/liblis.so.0.0.0
fortran/lisf_init.F:43:72: error: type of 'lis_set_argv_f' does not match original declaration [-Werror=lto-type-mismatch]
   43 |         call lis_set_argv(i,argv,ierr)
      |                                                                        ^
fortran/lisf_system.c:158:6: note: type mismatch in parameter 4
  158 | void lis_set_argv_f(LIS_INT *no, char *argv, LIS_INT *ierr, LIS_INT len)
      |      ^
fortran/lisf_system.c:158:6: note: type 'LIS_INT' should match type 'long int'
fortran/lisf_system.c:158:6: note: 'lis_set_argv_f_' was previously declared here
fortran/lisf_system.c:158:6: note: code may be misoptimized unless '-fno-strict-aliasing' is used
lto1: some warnings being treated as errors
lto-wrapper: fatal error: x86_64-pc-linux-gnu-cc returned 1 exit status
compilation terminated.
/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:484: liblis.la] Error 1

Even though -fno-strict-aliasing is used, the LTO optimizer still sees a type mismatch and diagnoses it as a dangerous problem. It also still suggests using -fno-strict-aliasing "if you aren't already", which I think happens because the lto-wrapper doesn't know how the code was originally compiled during the first pass?

@anishida
Owner

anishida commented Apr 1, 2024

What if '-Werror=strict-aliasing' is not used?
We have a somewhat special configuration for the case where we use 64-bit integers, so a fundamental fix may be difficult.

@anishida
Owner

anishida commented Apr 1, 2024

Specifically, LIS_INT is used instead of int or long int so that it can be switched depending on the settings in the configure script. This is probably the cause of the problem.
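A minimal sketch of the situation described here, assuming LIS_INT defaults to `int` unless a configure-time switch selects a wider type. It is not taken from the lis headers: `DEMO_USE_LONG_INT` and the `demo_` functions are hypothetical names used only for illustration. It checks whether LIS_INT is the `long int` that the LTO note says parameter 4 should match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for whatever the configure script
 * defines; the real macro name in lis may differ. */
#ifdef DEMO_USE_LONG_INT
typedef long int LIS_INT;
#else
typedef int LIS_INT;   /* assumed default for this sketch */
#endif

/* Sanity check that holds for both branches above. */
static_assert(sizeof(LIS_INT) >= sizeof(int), "LIS_INT is narrower than int");

/* Returns 1 when LIS_INT is exactly long int, 0 otherwise (C11 _Generic). */
int demo_lis_int_is_long(void)
{
    return _Generic((LIS_INT)0, long int: 1, default: 0);
}

/* Size of LIS_INT in bytes, for comparison with sizeof(long int). */
size_t demo_lis_int_size(void)
{
    return sizeof(LIS_INT);
}
```

Under the assumed default, `demo_lis_int_is_long()` returns 0, which would be consistent with the "type 'LIS_INT' should match type 'long int'" note in the log. Whether that is the actual cause in lis would need to be confirmed against its headers.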

@eli-schwartz
Author

eli-schwartz commented Apr 1, 2024

I'd like to back up a bit.

This is actually not about -Werror=strict-aliasing at all. I have that flag as it's one of several possible issues that can indicate miscompilation, but it's not triggering issues with lis anyway. The following minimal flags are sufficient to reproduce on my GCC 13 Gentoo system:

CFLAGS="-fdiagnostics-color=always -fno-strict-aliasing -flto=4 -Werror=lto-type-mismatch"
FFLAGS="-fdiagnostics-color=always -fno-strict-aliasing -flto=4 -Werror=lto-type-mismatch"
LDFLAGS="-flto=4 -Werror=lto-type-mismatch"

(Note that -fno-strict-aliasing is used to demonstrate that despite the warning suggesting to try it and see if it helps, it doesn't help. The -fdiagnostics-color flag simply provides pretty colored output so it's easier to manually read the logs.)

The resulting compile error:

libtool: compile:  x86_64-pc-linux-gnu-cc -DHAVE_CONFIG_H -I. -I../../include -I../../include -fdiagnostics-color=always -fno-strict-aliasing -flto=4 -Werror=lto-type-mismatch -fopenmp -c lisf_system.c  -fPIC -DPIC -o .libs/lisf_system.o
libtool: compile:  x86_64-pc-linux-gnu-gfortran -DHAVE_CONFIG_H -I. -I../../include -I../../include -fdiagnostics-color=always -fno-strict-aliasing -flto=4 -Werror=lto-type-mismatch -fopenmp -c lisf_init.F  -fPIC -o .libs/lisf_init.o

[...]

libtool: link: x86_64-pc-linux-gnu-cc -shared  -fPIC -DPIC  -Wl,--whole-archive array/.libs/libarray.a esolver/.libs/libesolver.a matrix/.libs/libmatrix.a matvec/.libs/libmatvec.a precision/.libs/libprecision.a precon/.libs/libprecon.a solver/.libs/libsolver.a system/.libs/libsystem.a vector/.libs/libvector.a fortran/.libs/libfortran.a -Wl,--no-whole-archive  -L/usr/lib/gcc/x86_64-pc-linux-gnu/13 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../.. -lgfortran /usr/lib/gcc/x86_64-pc-linux-gnu/13/libquadmath.so -lm  -fdiagnostics-color=always -flto=4 -Werror=lto-type-mismatch -fopenmp -flto=4 -Werror=lto-type-mismatch -fopenmp   -fopenmp -Wl,-soname -Wl,liblis.so.0 -o .libs/liblis.so.0.0.0
fortran/lisf_init.F:43:72: error: type of 'lis_set_argv_f' does not match original declaration [-Werror=lto-type-mismatch]
   43 |         call lis_set_argv(i,argv,ierr)
      |                                                                        ^
fortran/lisf_system.c:158:6: note: type mismatch in parameter 4
  158 | void lis_set_argv_f(LIS_INT *no, char *argv, LIS_INT *ierr, LIS_INT len)
      |      ^
fortran/lisf_system.c:158:6: note: type 'LIS_INT' should match type 'long int'
fortran/lisf_system.c:158:6: note: 'lis_set_argv_f_' was previously declared here
fortran/lisf_system.c:158:6: note: code may be misoptimized unless '-fno-strict-aliasing' is used
lto1: some warnings being treated as errors
lto-wrapper: fatal error: x86_64-pc-linux-gnu-cc returned 1 exit status
compilation terminated.
/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:484: liblis.la] Error 1

@anishida
Owner

anishida commented Apr 1, 2024

What if '-Werror=lto-type-mismatch' is also removed? From the error message, it seems that this option is also related to LIS_INT.

@eli-schwartz
Author

eli-schwartz commented Apr 11, 2024

I discuss this in my initial report:

> Note the -Werror=* flags are used to help detect cases where the compiler tries to optimize by assuming UB cannot exist in the source code -- if it does exist, ordinarily the code would be miscompiled, and this says to make the miscompilation a fatal error.

I would prefer not to remove a diagnostic flag that informs me that there is a coding error in the source code.

The "correct" solution is to make sure that the Fortran code and the C code agree on what type is being used for the interface in question. I'm not sure how hard that is -- it may be more complicated due to the nature of multiple languages (especially older Fortran standards) but I simply lack experience with Fortran so I couldn't tell you for sure...
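As a rough illustration of what "agreeing on the type" could look like on the C side: the diagnostic points at parameter 4 (`LIS_INT len`), which appears to be the hidden string-length argument gfortran appends for CHARACTER arguments. The gfortran manual documents that argument as `size_t` since GCC 8 and as `int` in older releases. The sketch below is an assumption-laden example, not the lis code: `demo_charlen_t`, `demo_trimmed_len`, and `demo_set_argv_f` are hypothetical names, LIS_INT is assumed to be `int`, and keying on the C compiler's `__GNUC__` only approximates the gfortran version.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for lis's configurable integer type (assumed to be int here). */
typedef int LIS_INT;

/* Hypothetical typedef for gfortran's hidden string-length argument:
 * size_t since GCC 8, int before. Checking __GNUC__ of the C compiler is
 * only an approximation of the gfortran version actually in use. */
#if defined(__GNUC__) && __GNUC__ >= 8
typedef size_t demo_charlen_t;
#else
typedef int demo_charlen_t;
#endif

/* Length of a blank-padded Fortran string with trailing spaces removed. */
size_t demo_trimmed_len(const char *s, demo_charlen_t len)
{
    assert(s != NULL);
    size_t n = (size_t)len;
    while (n > 0 && s[n - 1] == ' ')
        n--;
    return n;
}

/* Hypothetical wrapper shaped like the signature in the log, but with the
 * hidden length declared as demo_charlen_t instead of LIS_INT.
 * Sets *ierr to 0 for a non-blank argument and 1 otherwise. */
void demo_set_argv_f(LIS_INT *no, char *argv, LIS_INT *ierr, demo_charlen_t len)
{
    (void)no;  /* the argument index is unused in this sketch */
    *ierr = (demo_trimmed_len(argv, len) > 0) ? 0 : 1;
}
```

Only the type of the last parameter differs from the signature shown in the error message. Whether lis also has to support Fortran compilers with a different convention is something the maintainers would know better.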

@anishida
Owner

Thank you for your reply. So far, there have been no reports of errors in this library.
