From 1bea95fc19a93ec923b749921f175ffc759dc2a4 Mon Sep 17 00:00:00 2001 From: Ryan Houdek Date: Tue, 3 Dec 2024 12:16:13 -0800 Subject: [PATCH] _posts/2024-12-03-FEX-2412.md --- _posts/2024-12-03-FEX-2412.md | 84 +++++++++++++++++++++++++++++++++++ 1 file changed, 84 insertions(+) create mode 100644 _posts/2024-12-03-FEX-2412.md diff --git a/_posts/2024-12-03-FEX-2412.md b/_posts/2024-12-03-FEX-2412.md new file mode 100644 index 0000000..0bc2176 --- /dev/null +++ b/_posts/2024-12-03-FEX-2412.md @@ -0,0 +1,84 @@ +--- +layout: post +title: FEX 2412 Tagged +author: FEX-Emu Maintainers +--- + +Welcome back to another FEX release! With the last month's release being cancelled there is a few more changes than average. + +## Fix Steam's webhelper process again +On [November 5th](https://store.steampowered.com/news/app/593110/view/4472730495692571024), Valve's Steam client updated their embedded version of +Chromium. This uncovered new bugs in FEX-Emu and caused Steam to constantly restart its UI. Luckily [Asahi Lina](https://vt.social/@lina) diagnosed +the issues and resolved the problems that were occuring! Now Steam is more stable than it has ever been! + +## Fix WINE path execution problems +Asahi Lina had encountered a bug when running WINE under Fedora which wasn't seen before. Turns out FEX-Emu was incorrectly concating some filepaths +together when child processes were getting executed inside WINE. This usually worked but due to the combination of a newer version of WINE and how +Fedora set up its paths caused this to break under FEX. With this issue resolved, WINE is working again. There are likely some more bugs due to +the nature of how FEX sets up its path handling, but those will likely get solved in time. + +## Emulate PAUSE instruction using WFE instead of YIELD +This is mostly just a passing curiousity since it doesn't really matter. FEX has long emulated the x86 instruction PAUSE by using a theoretically +functionally equivalent ARM64 instruction called YIELD. It came to our attention over the last couple months that Apple Silicon and ARM Cortex +implement the YIELD instruction as a nop, so it effectively does nothing. This kind of makes sense since it is typically useful for SMT systems, and +no relevant ARM cores support SMT. + +On x86 CPUs this matters more since they do support SMT, but PAUSE depending on CPU will also have the CPU core sleep for some number of nanoseconds +that is implementation dependent. To be more similar to how PAUSE operates, we have instead switched over to using the WFE instruction which provides +similar guarantees on most ARM64 hardware. The trade-off here is that Qualcomm's new Oryon CPU treats WFE as a nop instead. A reasonable trade-off and +just an interesting quirk of their hardware. + +## Adds support for compat input prctl +A major problem that constantly comes back to bite FEX is emulating 32-bit syscalls with complete accuracy. Due to the Linux Kernel hiding information +about a syscall returning opaque data structures, FEX jumps through a lot of hoops to make sure everything works while running as a 64-bit ARM +process. + +The ioctl syscall is one of these annoying syscalls that completely change how they behave depending on the device that is being communicated with, +requiring FEX to track file descriptors and wrap all ioctl syscalls with checks see which ioctl it is, to which device. This is fraught with problems +since we can't always know 100% if we are actually communicating with a device that we think it is. + +This leads us to this new bug that we encountered. When a 32-bit process is communicating with a gamepad, it uses a combination of the joydev and +udev subsystems inside of the Linux kernel. Any 32-bit game was having issues where gamepads where the controls were "sticky". +While FEX is catching the ioctl communications and translating the data as necessary, there was a more nefarious problem happening. Turns out the +Linux kernel in some instances will check to see if a syscall is coming from either a "compat" process, or is inside of a "compat" syscall. A +compatibility process is just a process in which the Linux kernel knows it is a 32-bit process, while the kernel is 64-bit. In x86 land this is easy +since you just check if it is x86 or x86-64. When running under FEX, the process is /never/ a compatibility process, because it is always 64-bit. + +Turns out that the read and write syscalls, depending on what type of device it is communicating with, will return different data if it is compat or +not! This isn't just specific to gamepads, there are various subsystems in Linux that check if they are in a compat process and modify the data. This +is a huge problem since we would introduce a ton of overhead tracking every read, write, seek, etc syscall and try to compare it against something +that needs modified data or not. [slp](https://github.com/slp) decided to tackle this for the Asahi Linux distro by adding a new prctl to changing the +data to the "compat" versions if enabled, even in a 64-bit process. This is a bandage to a significant problem as it only covers gamepad devices +today. We will continue tracking these issues in to more subsystems as we find them over on our [32Bit Syscall +Woes](https://wiki.fex-emu.com/index.php/Development:32Bit_Syscall_Woes) wiki page. + +## Support CPU index through TPIDRRO +This is a fun little feature to make FEX's life easier. When FEX is emulating the CPUID and RDTSCP instructions, we need to know the physical CPU +index because the instructions will give that information to the program. The RDTSCP instruction gives the index directly, while the CPUID instruction +uses it for per-cpu data lookup and also some indexing data. + +The fun thing is that Windows is the only operating system that supports this today. So FEX only gets to enjoy this in a little toy sandbox for when +it is running under Windows. The linux implementation on the other hand does a syscall to **getcpu** for each one of these instructions getting +emulated, which adds a bunch of overhead. This is due to Linux not supporting using TPIDRRO in this fashion, instead the register is set to zero. + +This has an additional knock-on effect that if WINE wants to run Arm64 native Windows applications, this is going to break any algorithm that relies +on that Windows feature. Ideally the Linux kernel implements this feature for both WINE compatibility, and FEX performance improvements some day! + +## FEXCore cleanups and restructuring +While not user facing at all, there has been some restructuring of how FEXCore, the CPU emulation backend of FEX-Emu, exposes its CPU emulation. +Historically FEX-Emu has had a problem where we leak OS information across the boundary, making the interface to the CPU backend more convoluted than +it needed to be. Over the last couple of months we have been removing large amounts of the Linux specific code from FEXCore and moving it in to the +frontend. + +While this doesn't affect the user, it makes our programmer's lives easier since they don't need to dance around OS abstractions quite as much. It +also makes it easier for upcoming debugger improvements that FEX-Emu is sorely lacking today. + +## Fixed some JIT bugs +We uncovered some implementation bugs in our AVX emulation this last month. There were four instructions that had some incorrect behaviour. While +fixed, we don't know of any actual bugs in games that happened because of these edge case failures. It's a good testament towards having unittesting +infrastructure that can catch behaviours like this, and knowing that even with thousands of hand-written assembly tests, some edge cases still aren't +tested. + +There were also a couple of other misc optimizations and bug fixes but we've gone on for quite a while now. + +See the [2412 Release Notes](https://github.com/FEX-Emu/FEX/releases/tag/FEX-2412) or the [detailed change log](https://github.com/FEX-Emu/FEX/compare/FEX-2410...FEX-2412) in Github.