Skip to content

std.os.linux: Add tests for socket-related syscalls, remove usages of socketcall #23853

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

mpfaff
Copy link
Contributor

@mpfaff mpfaff commented May 11, 2025

This PR makes a number of changes discussed in #23844.

During my testing, I discovered that Zig's generated syscall numbers for x32 were incorrect.

The x32 ABI requires that the syscall numbers be ORed with __X32_SYSCALL_BIT (which is 0x40000000) to distinguish them from x86_64 syscalls. However, Zig currently does not do this.

Perhaps this has not been an issue until now because the syscalls that were tested had the same "base" numbers as their x86_64 counterparts and had a similar enough ABI that things worked out. However, some syscalls do not have the same "base" number. As a result, trying to call these syscalls without the indicator bit set fails with ENOSYM.

I tried out two approaches to resolving this issue. The first you can see in commit 6ac528a. The second you can see in commit 7b20d52.

About that second approach, I haven't been able to test it myself as I haven't figured out how to run tools/generate_linux_syscalls.zig yet (I'm not sure how to get the headers it needs. Seems like it's more involved than simply pointing it at the kernel source code), but you can see my attempt at modifying that program to set the indicator bit. I reused the code that adds an "offset" for the mips syscalls. I'm not sure if that's appropriate. It would be more correct to OR it instead of adding, although you can see that musl also adds instead of ORing.

For the time being I've reverted the second approach and reapplied the first.

I'm not sure if this style of testing is what you would like to see, but I thought it the best way to make sure the syscall functions were implemented currently.

Once I got those tests implemented and passing, I went ahead and removed the usages of socketcall.


There is an additional note about the x32 ABI on the syscall man page that I think needs some attention:

Most of these additional system calls are actually identical to the system calls used for providing i386 compat. There are some notable exceptions, however, such as preadv2(2), which uses struct iovec entities with 4-byte pointers and sizes ("compat_iovec" in kernel terms), but passes an 8-byte pos argument in a single register and not two, as is done in every other ABI.

If I understand correctly, "pos argument" refers to the offset argument. If that is the case, this exception is not handled by the current implementation of preadv2 in Zig. I'm not yet sure what the other exceptions are, but I intend to look into it and make another PR for those further changes.

@alexrp
Copy link
Member

alexrp commented May 11, 2025

The x32 ABI requires that the syscall numbers be ORed with __X32_SYSCALL_BIT (which is 0x40000000) to distinguish them from x86_64 syscalls. However, Zig currently does not do this.

Good find!

Perhaps this has not been an issue until now because the syscalls that were tested had the same "base" numbers as their x86_64 counterparts and had a similar enough ABI that things worked out. However, some syscalls do not have the same "base" number. As a result, trying to call these syscalls without the indicator bit set fails with ENOSYM.

It hasn't been an issue because we don't test x32 binaries at all (there's no qemu-x32). 🙂 On top of that, the std.os.linux syscall layer is only used when not linking libc, and we don't even have arch bits for x32 yet (#22189).

Note that lib/std/os/linux/x86_64.zig is not the arch bits file that will be used for x32; it will get a separate x32.zig once actually implemented.

I tried out two approaches to resolving this issue. The first you can see in commit 6ac528a. The second you can see in commit 7b20d52.

Fixing this in the generator is the right approach, although I would suggest opening a separate PR for that.

If I understand correctly, "pos argument" refers to the offset argument. If that is the case, this exception is not handled by the current implementation of preadv2 in Zig. I'm not yet sure what the other exceptions are, but I intend to look into it and make another PR for those further changes.

This isn't too surprising since our x32 support in std.os.linux is very incomplete. Patches definitely welcome.

@alexrp
Copy link
Member

alexrp commented May 11, 2025

About that second approach, I haven't been able to test it myself as I haven't figured out how to run tools/generate_linux_syscalls.zig yet (I'm not sure how to get the headers it needs. Seems like it's more involved than simply pointing it at the kernel source code)

zig run tools/generate_linux_syscalls.zig -- $PWD $PWD/../linux is how it's supposed to work. Perhaps something broke it?

Copy link
Contributor

@rootbeer rootbeer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The socket syscall changes here look good to me. The test seem very thorough! (Though as Alex notes they belong in the linux test.zig.)

I think you know, but just to be sure: the .x86 native_arch is different from the x32 arch. Both are 32-bit x86 flavors, but one is the "native" 32-bit (sometimes called "i386"), and x32 is the 32-bit-on-x64 pseudo architecture. As such, I'd suggest splitting the x32 changes out from the socketcall changes (but its not a big deal, and I'm not empowered to make any demands).

It hasn't been an issue because we don't test x32 binaries at all (there's no qemu-x32).

If you're on Debian, you can enable x32 binary support with a kernel command line option (see https://wiki.debian.org/X32Port). Other distros probably require the kernel to be recompiled with x32 support enabled. I'm not sure if either of these options are viable for Zig's CI....

I believe that note you found about preadv is handled correctly by Zig because Zig uses the x64 syscall ABI as the basis for the x32 implementation. There are still a ton of problems, but in my experience its mostly due to how usize is used as "native register size" in much Zig code, but that is smaller than a (64-bit) register on x32. (I have a branch with a lot of clean up for this, but I haven't pushed any of it. I will try to get more of it out in public if you're also interested in improving x32).

@mpfaff
Copy link
Contributor Author

mpfaff commented May 11, 2025

If you're on Debian, you can enable x32 binary support with a kernel command line option (see https://wiki.debian.org/X32Port). Other distros probably require the kernel to be recompiled with x32 support enabled. I'm not sure if either of these options are viable for Zig's CI....

I can't speak for other distros, but at least the linux-zen kernel on Arch Linux has x32 support enabled by default.

@mpfaff mpfaff force-pushed the linux-no-socketcall branch from 13270c0 to d69f1f5 Compare May 11, 2025 17:30
@mpfaff
Copy link
Contributor Author

mpfaff commented May 11, 2025

I've removed the x32 fixes from this PR and left them in a new branch in my fork for now. I'll try to get them cleaned up a bit and open a new PR.

After rebasing on master, I'm unable to compile all of the x86_64-linux and x86-linux tests on macOS. This is the error I'm getting:

error: sub-compilation of musl libc.a failed
    note: failed to check cache: 'lib/libc/musl/src/string/bzero.c' file_hash FileNotFound

I suspect it has to do with the changes made by #23812, however, I can't find any clear reference to bzero.c left in the Zig codebase... Clearing .zig-cache and ~/.cache/zig did not resolve the issue.

@alexrp
Copy link
Member

alexrp commented May 11, 2025

You need to rebuild your compiler; it probably still has the old source file list.

@mpfaff
Copy link
Contributor Author

mpfaff commented May 11, 2025

Heh, I just realized that but you beat me to it. Thanks for confirming!

@mpfaff
Copy link
Contributor Author

mpfaff commented May 11, 2025

zig run tools/generate_linux_syscalls.zig -- $PWD $PWD/../linux is how it's supposed to work. Perhaps something broke it?

I'm not sure the first $PWD is right. Seems like it needs to be the path to the zig executable.

With that slight change, and the right path to the Linux source code on my system, this is the error I'm getting:

arch/arm64/include/uapi/asm/unistd.h:2:10: fatal error: 'asm/unistd_64.h' file not found
    2 | #include <asm/unistd_64.h>
      |          ^~~~~~~~~~~~~~~~~
1 error generated.

zig cc exited with code 1

From poking around a little bit, it seems like that file should be generated by the build system, but I'm not sure how.

@alexrp
Copy link
Member

alexrp commented May 11, 2025

Ah, I forgot for a minute - the kernel switched to syscall tables for all ports, which means our generator needs to be changed completely. See #21440 (cc @The-King-of-Toasters). I suggest just leaving this alone for now, but it would be good to file an issue for the x32 syscall offset thing so it isn't forgotten.

@mpfaff
Copy link
Contributor Author

mpfaff commented May 11, 2025

Alright, I've opened #23859 to track the x32 syscall issue.

@alexrp alexrp changed the title std.os.linux: Add tests for socket-related syscalls, fix x32 syscall numbers, remove usages of socketcall std.os.linux: Add tests for socket-related syscalls, remove usages of socketcall May 11, 2025
@mpfaff mpfaff force-pushed the linux-no-socketcall branch from d69f1f5 to 84374e6 Compare May 15, 2025 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants