Build thunks GuestLibs on other arm64 distros (say, Arch Linux) #1996

Open
phire opened this issue Sep 16, 2022 · 16 comments

Comments

@phire
Member

phire commented Sep 16, 2022

Currently we rely on Debian's gcc-x86-64-linux-gnu package to supply a working x86_64 cross-compiler on Debian/Ubuntu distros. But other distros (like Arch Linux) are missing such handy packages.

I've done a bit of preliminary research, and it seems like these are our rough options:

Option One - Use clang instead:

Clang helpfully includes compilers for all triples by default, and cross-compiling is simply a matter of passing the correct triple into clang/clang++. This seems like an elegant solution: we already require clang for thunk generation, so why not use it for building thunks too?

However... This solution only gets you a compiler.
The gcc-x86-64-linux-gnu package (and the g++/multilib-i686 variants) actually pull in a bunch of other packages to supply a full libc and libc++ and other support libraries that our thunks need to build. If we go with the clang option, then we need to supply the missing includes from somewhere else.
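
To make the "only a compiler" point concrete, here is a minimal sketch. It assumes a clang build with the X86 backend enabled and a hypothetical sysroot at `$HOME/x86-sysroot`; the helper name `guest_clang_flags` is made up for illustration. `--target` and `--sysroot` are standard clang driver options.

```shell
#!/bin/sh
# Hypothetical helper: print the clang flags for cross-compiling to a guest triple.
# $1 = target triple, $2 = sysroot holding the guest headers (an assumed layout).
guest_clang_flags() {
    printf '%s\n' "--target=$1 --sysroot=$2"
}

# A header-free translation unit needs nothing beyond the compiler itself.
# Anything that includes <stdio.h> or <vector> also needs guest headers,
# which is exactly the part the Debian cross packages normally provide.
if command -v clang >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    printf 'int add(int a, int b) { return a + b; }\n' > "$workdir/add.c"
    # Word splitting of the flags is intentional (breaks on paths with spaces).
    clang $(guest_clang_flags x86_64-linux-gnu "$HOME/x86-sysroot") \
        -c "$workdir/add.c" -o "$workdir/add.o" \
        || echo "clang could not target x86_64 on this machine" >&2
    rm -rf "$workdir"
fi
```

The sysroot does not need to exist for the header-free file above; it only starts to matter once the source pulls in libc or C++ standard library headers.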

But from where? We have two potential options:

  • Most of the include files can be found in the host, though we will need to supply the missing x86 bits of linux headers, glibc and libstdc++.
    • This option runs the risk that other random libraries will change their include files slightly depending on host arch, and might end up doing the wrong thing
    • We will be missing the .so files, can we create our thunks without them?
  • We just supply all the includes, either:
    • a rootfs style download (generated along-side our rootfs? That would keep them in-sync)
    • as a submodule

Option Two - Generate equivalent packages for Arch

These packages are clearly useful on Debian; it would be nice to have equivalents in Arch Linux ARM.

There are some downsides to this approach:

  1. these packages will be in AUR and will take a while to build from source (especially if people are building them on lower-end ARM SBCs, maybe we could supply equivalent binary packages?)
  2. We would need to maintain them
  3. It only fixes the issue for Arch Linux ARM and derivatives; other distros still won't be able to build thunks.

Idea three - Just use a native x86 toolchain inside our rootfs

This is an ugly and lazy option. It requires a large download (and probably unpacking the rootfs to install a compiler).


Whatever solution we go with, we should standardise across all distros, so we aren't using widely different approaches on different distros. I'm currently leaning towards Option One, with us packaging headers, not just for libc/libstdc++, but all libraries we thunk.

@neobrain
Member

neobrain commented Sep 16, 2022

Is the issue on Arch that there is no x86 glibc/libstdc++ package available, or not even the x86 cross-compiler itself? I'm guessing chances are slim someone else already wrote an AUR for this?

Some other things worth considering alongside this issue:

  1. We may need to thunk multiple versions of the same library (with ABI changes), which could require having multiple versions of the headers around
  2. As you mentioned, library headers might differ between ARM/x86 (consider autogenerated config.h-style files that define types)
  3. Ideally, build system integration wouldn't suffer too much from externally supplied headers (we probably don't want to just globally add hundreds of different library include directories)

Idea three is a nonsolution for (1). Git submodules also are a bad fit for it, since we'd have to start including the same repository from multiple submodules. I'd also like to avoid ending up with a megarepo with hundreds of libraries that must be updated individually without proper tooling.

Depending on how complex this gets, we may be better off resorting to a containerized Ubuntu install. Maybe the cross-compiler could be provided in an Ubuntu podman instance that is integrated using toolbox on non-Ubuntu systems? That way, whatever library structure we end up using would be versionable in a Dockerfile as well.

That is not something we need to tackle in its entirety immediately, but the approach we choose to support cross-compilation on non-Ubuntu distros shouldn't require us to throw everything over again once we start worrying about points (1)-(3). Can we get away with just requiring an x86 toolchain with standard libraries to be provided by the user for now, and provide a minimal toolbox environment to satisfy this requirement on Arch?

Some other random thoughts:

We will be missing the .so files, can we create our thunks without them?

Guest .so files aren't used in the build process, so that should be fine.

It only fixes the issue for Arch Linux ARM and derivatives; Other distros still won't be able to build thunks.

Just to add one data point, I think openSUSE can almost build thunks out-of-the-box (there are merely two minor issues unrelated to the toolchain). This may change once support for 32-bit thunks is added, though.

Idea three - Just use a native x86 toolchain inside our rootfs

I'm not sure this can easily be done without chrooting into the rootfs, or without invasively copying files from the rootfs into the global system directories.

@phire
Member Author

phire commented Sep 16, 2022

Is the issue on Arch that there is no x86 glibc/libstdc++ package available, or not even the x86 cross-compiler itself?

Not even a cross-compiler.
But it would be easy enough to make a PKGBUILD for AUR.

  1. We may need to thunk multiple versions of the same library (with ABI changes), which could require having multiple versions of the headers around

This is actually a very good point, and it might be worth designing a system explicitly to handle this case.

Git submodules also are a bad fit

Yeah, let's stay away from submodules. I think we want to go with something deterministic.

Depending on how complex this gets, we may be better off resorting to a containerized Ubuntu install

Eh.... I think containers will be a mistake for this. If we use clang as the cross-compiler, then the only extra files we need are headers.


What I'm considering right now is a deterministic system that downloads and extracts just the header files (from Debian or Ubuntu packages?) into per-library folders at configure time. I think I have experimental code somewhere that can be adapted for this.

Then we just pass the required include paths into the build for each guest thunk library.

This way, we avoid full containers, git submodules, and any chrooting, and we avoid all dependencies on host packages (except clang).
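
A rough sketch of the extraction step described above, not the experimental code mentioned. It assumes the package payload has already been downloaded as a tarball (for a .deb, that would be its data.tar.* member) and that headers live under usr/include inside it. The function name `extract_headers` and the output layout `<root>/<library>/include` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: unpack only the headers from a package payload tarball
# into a per-library folder.
# $1 = payload tarball, $2 = expected SHA-256, $3 = library name, $4 = output root
extract_headers() {
    archive=$1 expected=$2 lib=$3 outroot=$4

    # Pin the input by hash so every configure run sees identical headers.
    actual=$(sha256sum "$archive" | cut -d ' ' -f 1)
    if [ "$actual" != "$expected" ]; then
        echo "checksum mismatch for $archive" >&2
        return 1
    fi

    tmp=$(mktemp -d) || return 1
    # Unpack to a scratch directory, then keep usr/include and drop the rest.
    if tar -xf "$archive" -C "$tmp" && [ -d "$tmp/usr/include" ]; then
        mkdir -p "$outroot/$lib/include"
        cp -R "$tmp/usr/include/." "$outroot/$lib/include/"
        status=$?
    else
        echo "no usr/include in $archive" >&2
        status=1
    fi
    rm -rf "$tmp"
    return "$status"
}
```

Each guest thunk library would then get its own `-I<root>/<library>/include` on the compile line, which also leaves room for keeping several versions of the same library's headers side by side.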

@neobrain
Member

Eh.... I think containers will be a mistake for this. If we use clang as the cross-compiler, then the only extra files we need are headers.

Yup, if providing a cross-toolchain for Arch isn't that big of a deal, then containers don't provide much of a benefit here.

What I'm considering right now is a deterministic system that downloads and extracts just the header files (from Debian or Ubuntu packages?) into per-library folders at configure time. I think I have experimental code somewhere that can be adapted for this.

Yeah, that sounds fine for the time being too. Some libraries might need additional defines (which would normally be set up by pkgconf/CMake), but we can continue setting these on a case-by-case basis for now.

@phire
Member Author

phire commented Sep 17, 2022

I'll whip up a prototype on Monday to double-check that this is a sane direction.

Theoretically, such a system could be extended in the future to automatically handle pkgconf, but we might be better off with the extra control that setting these manually gets us.

@neobrain
Member

neobrain commented May 2, 2024

What about https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html

https://ziglang.org/learn/overview/#cross-compiling-is-a-first-class-use-case

FEX is a C++ project. If you see an applicable idea in this 40-minute-long article, please summarize your proposal here.

@teohhanhui
Contributor

teohhanhui commented May 2, 2024

please summarize your proposal here

Sorry, I'm not familiar enough with things to be able to offer any concrete proposals. I just came across this issue while trying to compile thunks on Fedora, which lacks a cross gcc toolchain.

FEX is a C++ project.

But it does seem like zig supports C++ too (including libc++): https://ziglang.org/download/0.11.0/release-notes.html#zig-c

@alyssarosenzweig
Collaborator

please summarize your proposal here

Sorry, I'm not familiar enough with things to be able to offer any concrete proposals. I just came across this issue while trying to compile thunks on Fedora, which lacks a cross gcc toolchain.

FEX is a C++ project.

But it does seem like zig supports C++ too (including libc++): https://ziglang.org/download/0.11.0/release-notes.html#zig-c

Oh, this is a neat idea. Going to give this a whirl, thanks for the suggestion!

@alyssarosenzweig
Collaborator

My FEX repo's zig/cc branch has a proof of concept of building x86_64 thunks with zig cc on Fedora 39, with no special rootfs etc. glxgears works; so far no Steam games do. Still promising, I think I'll push on this some more next week.

@teohhanhui Is this interesting from a fedora packaging perspective? or just for developer convenience?
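
For anyone wanting to try something similar, here is one possible way to plug zig into a build that expects a single compiler executable. This is only a sketch and not necessarily how the zig/cc branch does it: the wrapper names and the `write_zig_wrapper` helper are made up, while `zig cc`, `zig c++` and `-target x86_64-linux-gnu` are existing zig command-line features.

```shell
#!/bin/sh
# Hypothetical helper: write a small wrapper script that forwards to
# `zig cc` or `zig c++` for a fixed target.
# $1 = wrapper path, $2 = zig subcommand ("cc" or "c++"), $3 = zig target triple
write_zig_wrapper() {
    cat > "$1" <<EOF
#!/bin/sh
exec zig $2 -target $3 "\$@"
EOF
    chmod +x "$1"
}

# Example: generate both wrappers in a scratch directory (assumed location).
bindir=$(mktemp -d)
write_zig_wrapper "$bindir/x86_64-zig-cc"  cc  x86_64-linux-gnu
write_zig_wrapper "$bindir/x86_64-zig-c++" c++ x86_64-linux-gnu
echo "wrappers written to $bindir"
```

The generated wrappers could then be named as the C and C++ compilers in a guest toolchain file, since zig bundles the libc and libc++ headers for the target.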

@teohhanhui
Contributor

teohhanhui commented May 3, 2024

@teohhanhui Is this interesting from a fedora packaging perspective?

I'm not (yet?) a Fedora packager, just someone trying to build a package: https://github.com/teohhanhui/rpms/tree/f39/fex-emu (but hopefully I could get this accepted into Fedora repos?)

So I think you'd have to ask someone else ^^

Might have better luck with f40? zig in f39 is 2 major versions behind.

@Sonicadvance1
Member

Anything that is found that could make packaging easier for Fedora on our end would be good to know. Currently I manage the Canonical Launchpad PPA builds, which mostly work out of the box, so I don't know what troubles other distros would have.
Thunk cross-compiling is a mighty painful one on non-multiarch systems, and I have no idea how to solve that other than their builders creating a development image or pulling one from somewhere.

@teohhanhui
Contributor

teohhanhui commented May 3, 2024

For Fedora, I wonder if there'd be any problems if we just ask the user to create a separate installroot for the guest libs? So just build the guest libs normally on the x86_64 build, and install it on your aarch64 system to a separate installroot e.g. /x86_64

Could only the HostLibs be built as part of the aarch64 build, and would that work with GuestLibs built as part of the x86_64 build?

@alyssarosenzweig
Collaborator

For Fedora, I wonder if there'd be any problems if we just ask the user to create a separate installroot for the guest libs? So just build the guest libs normally on the x86_64 build, and install it on your aarch64 system to a separate installroot e.g. /x86_64

Could only the HostLibs be built as part of the aarch64 build, and would that work with GuestLibs built as part of the x86_64 build?

That's part of what #3597 does; it's just annoying if you don't have x86 hardware 😉

@asahilina
Contributor

Could only the HostLibs be built as part of the aarch64 build, and would that work with GuestLibs built as part of the x86_64 build?

This does not work, because the host thunk generator relies on having a working clang environment for both the host arch and the guest arch. We'd need to serialize the data_layout on an x86_64 build, and then use that serialized representation for the host build on aarch64.

The easiest solution is the last commit in #3597, and I just verified that it allows building a full thunked FEX entirely on aarch64 Fedora, using a partial rootfs (just includes and some libs) from x86_64. I submitted that as #4230. With that, you can build FEX on Fedora aarch64 given:

A sysroot from an x86_64 machine

With the required devel packages and build environment installed, don't have a proper dependency list handy yet...

From the live distro root, a tarball of the necessary bits can be built like this (the tree ends up being 200 MB raw, 40 MB gzipped, and probably has a bunch of unnecessary includes since I didn't do a clean container):

tar cvzf ~/sysroot.tar.gz usr/include usr/lib/gcc usr/lib/clang usr/lib/lib[cm][_.]* usr/lib64/libmvec* usr/lib/crt*.o usr/lib64/lib[cm][_.]* usr/lib64/crt*.o lib64 lib usr/lib/ld* usr/lib64/ld* usr/lib64/libstdc++* usr/lib/libstdc++* usr/lib*/libgcc_s*

These toolchain config override files

Because Fedora's compilers are configured differently for native vs. cross-compile.

toolchain_x86_32.cmake

set(CMAKE_SYSTEM_PROCESSOR i686)
set(CMAKE_C_COMPILER x86_64-linux-gnu-gcc -m32 --sysroot=/home/lina/x86-rootfs -L/home/lina/x86-rootfs/usr/lib/gcc/i686-redhat-linux/14/)
set(CMAKE_CXX_COMPILER x86_64-linux-gnu-g++ -m32 --sysroot=/home/lina/x86-rootfs -L/home/lina/x86-rootfs/usr/lib/gcc/i686-redhat-linux/14/ -I/home/lina/x86-rootfs/usr/include/c++/14 -I/home/lina/x86-rootfs/usr/include/c++/14/i686-redhat-linux)

toolchain_x86_64.cmake

set(CMAKE_SYSTEM_PROCESSOR x86_64)
set(CMAKE_C_COMPILER x86_64-linux-gnu-gcc --sysroot=/home/lina/x86-rootfs -L/home/lina/x86-rootfs/usr/lib/gcc/x86_64-redhat-linux/14/)
set(CMAKE_CXX_COMPILER x86_64-linux-gnu-g++ --sysroot=/home/lina/x86-rootfs -L/home/lina/x86-rootfs/usr/lib/gcc/x86_64-redhat-linux/14/ -I/home/lina/x86-rootfs/usr/include/c++/14 -I/home/lina/x86-rootfs/usr/include/c++/14/x86_64-redhat-linux)

(I only tested gcc thunk builds, clang would need different settings)

The right CMake config

For manual builds I'm using:

CC=clang CXX=clang++ cmake \
  -DCMAKE_INSTALL_PREFIX=/usr \
  -DCMAKE_BUILD_TYPE=Release \
  -DUSE_LINKER=lld \
  -DENABLE_LTO=True \
  -DBUILD_TESTS=False \
  -DBUILD_THUNKS=True \
  -DX86_DEV_ROOTFS=$HOME/x86-rootfs \
  -DX86_32_TOOLCHAIN_FILE=$PWD/toolchain_x86_32.cmake \
  -DX86_64_TOOLCHAIN_FILE=$PWD/toolchain_x86_64.cmake \
  -G Ninja

With all this, FEX builds with thunks and it should "just work" (with AsahiLinux/muvm#132 which is a fix to make the thunk stuff visible in the guest).

@asahilina
Contributor

asahilina commented Dec 20, 2024

I managed to reduce the sysroot to just a handful of -devel RPMs and zero libraries (and one funny hack), so the plan now is to just package up those RPMs ad-hoc into an archive the FEX RPM spec can consume, and use that for the build.

alsa-lib-devel-1.2.13-3.fc40.x86_64
glibc-devel-2.39-30.fc40.i686
glibc-devel-2.39-30.fc40.x86_64
kernel-headers-6.12.4-100.fc40.x86_64
libdrm-devel-2.4.123-1.fc40.x86_64
libglvnd-devel-1.7.0-4.fc40.x86_64
libstdc++-devel-14.2.1-3.fc40.x86_64
libX11-devel-1.8.10-2.fc40.x86_64
libxcb-devel-1.17.0-2.fc40.x86_64
libXrandr-devel-1.5.4-3.fc40.x86_64
libXrender-devel-0.9.11-6.fc40.x86_64
mesa-libEGL-devel-24.1.7-1.fc40.x86_64
wayland-devel-1.23.0-2.fc40.x86_64
xorg-x11-proto-devel-2024.1-2.fc40.noarch

I switched to clang for the build, toolchain files now look like:

# 32-bit guest (i686) toolchain file
set(CMAKE_EXE_LINKER_FLAGS_INIT "-fuse-ld=lld")
set(CMAKE_MODULE_LINKER_FLAGS_INIT "-fuse-ld=lld")
set(CMAKE_SHARED_LINKER_FLAGS_INIT "-fuse-ld=lld")
set(CMAKE_C_COMPILER clang)
set(CMAKE_CXX_COMPILER clang++)
set(CLANG_FLAGS "-nodefaultlibs -nostartfiles -target i686-linux-gnu --sysroot=${X86_DEV_ROOTFS} -msse2 -mfpmath=sse")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${CLANG_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${CLANG_FLAGS} -I${X86_DEV_ROOTFS}/usr/include/c++/14/ -I${X86_DEV_ROOTFS}/usr/include/c++/14/x86_64-redhat-linux/32")
set(CMAKE_C_COMPILER_FORCED TRUE)
set(CMAKE_CXX_COMPILER_FORCED TRUE)

# 64-bit guest (x86_64) toolchain file
set(CMAKE_EXE_LINKER_FLAGS_INIT "-fuse-ld=lld")
set(CMAKE_MODULE_LINKER_FLAGS_INIT "-fuse-ld=lld")
set(CMAKE_SHARED_LINKER_FLAGS_INIT "-fuse-ld=lld")
set(CMAKE_C_COMPILER clang -target x86_64-linux-gnu)
set(CMAKE_CXX_COMPILER clang++ -target x86_64-linux-gnu)
set(CLANG_FLAGS "-nodefaultlibs -nostartfiles -target x86_64-linux-gnu --sysroot=${X86_DEV_ROOTFS}")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${CLANG_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${CLANG_FLAGS} -I${X86_DEV_ROOTFS}/usr/include/c++/14/ -I${X86_DEV_ROOTFS}/usr/include/c++/14/x86_64-redhat-linux")
set(CMAKE_C_COMPILER_FORCED TRUE)
set(CMAKE_CXX_COMPILER_FORCED TRUE)

The hack is you need to touch usr/lib/gcc/x86_64-redhat-linux/14/crtbegin.o and usr/lib/gcc/x86_64-redhat-linux/14/32/crtbegin.o in the sysroot to get clang (when run as part of thunkgen) to pick up the C++ includes from the right place.
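
The hack can be scripted along these lines. This is a minimal sketch: the function name `add_crtbegin_stubs` is made up, and the GCC version directory (14) matches the Fedora 40 packages listed above, so it would differ on other releases. My understanding is that clang's GCC installation detection probes for crtbegin.o, which would explain why an empty file is enough, but treat that as an assumption.

```shell
#!/bin/sh
# Create the empty crtbegin.o placeholders inside a sysroot.
# $1 = sysroot directory (for example the one passed as X86_DEV_ROOTFS)
add_crtbegin_stubs() {
    sysroot=$1
    for dir in usr/lib/gcc/x86_64-redhat-linux/14 \
               usr/lib/gcc/x86_64-redhat-linux/14/32; do
        mkdir -p "$sysroot/$dir" || return 1
        # touch leaves an existing file's contents alone and creates an
        # empty one otherwise.
        touch "$sysroot/$dir/crtbegin.o" || return 1
    done
}
```

Usage would be something like `add_crtbegin_stubs "$HOME/x86-rootfs"` after unpacking the -devel RPMs and before configuring the build.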

@asahilina
Contributor

Just for closure, this is what we ended up doing: https://src.fedoraproject.org/rpms/fex-emu/pull-request/7

I realized that thunkgen is happy to use native includes for the thunked lib headers (it appends /usr/include to the include paths), so I adjusted the toolchain files to match. That means we only end up needing glibc & friends in the sysroot.
