Skip to content

Add initial f16 and f128 support to the s390x backend #10691

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 29, 2025

Conversation

beetrees
Copy link
Contributor

This PR adds initial support for passing f16 and f128 values around to the s390x backend. Support is added for the load, store, bitcast, f16const and f128const CLIF instructions.

Note that the s390x ABI specification currently does not specify the ABI for f16. However, Clang recently added support for f16 in llvm/llvm-project#109164 (as opposed to LLVM just supporting it at the LLVM IR level) using a straightforward extrapolation of the ABI (passing f16 in floating point registers just like f32 and f64), so on that basis I've not put the f16 ABI behind the enable_llvm_abi_extensions setting.

f16/f128 issue: #8312

@beetrees beetrees requested a review from a team as a code owner April 28, 2025 21:43
@beetrees beetrees requested review from cfallin and removed request for a team April 28, 2025 21:43
@cfallin
Copy link
Member

cfallin commented Apr 28, 2025

cc @uweigand -- would you mind reviewing the s390x backend changes here? (Thanks!)

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. isle Related to the ISLE domain-specific language labels Apr 28, 2025
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:area:machinst", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@uweigand
Copy link
Member

Thanks for working on s390x! For f128, current hardware actually has two different ways of representing them - there's the "traditional" way of splitting an f128 value across a pair of 64-bit floating-point registers, but since z14 we can also hold f128 values in a single vector register, with a full set of arithmetic operations available.

This patch is based on the traditional FPR pair approach, but I would much prefer using single vector registers, for several reasons:

  • z14 is the minimum architecture level for cranelift, so the VR-based instructions are always available
  • Handling a single VR is much simpler than handling a pair of FPRs
  • Specifically, while your patch hasn't run into this problem yet, once you actually want to use any of the traditional f128 arithmetic instructions, those only operate on architected pairs (N, N+2), not any two random FPRs. But the cranelift regalloc actually isn't capable of allocating pairs, so those would have to be hardcoded everywhere

Note that the one place where we'd still require pairs (even when using VRs everywhere else) is for function arguments and returns, as the ABI was defined for older systems. But for the ABI we have to hard-code registers anyway, so that should be much less of an issue. (For our own non-system ABIs, we could even chose to pass f128 in VR as well.)

LLVM (and GCC) will also use VRs to hold f128 if you build with -march=z14 or higher.

@beetrees beetrees force-pushed the f16-f128-s390x-mvp branch from 989d9aa to 950b490 Compare April 29, 2025 13:40
@beetrees beetrees force-pushed the f16-f128-s390x-mvp branch from 950b490 to 578707c Compare April 29, 2025 13:40
@beetrees
Copy link
Contributor Author

Done. f128 is now stored in a single vector register.

Note that the one place where we'd still require pairs (even when using VRs everywhere else) is for function arguments and returns, as the ABI was defined for older systems. But for the ABI we have to hard-code registers anyway, so that should be much less of an issue. (For our own non-system ABIs, we could even chose to pass f128 in VR as well.)

I'm unsure what you mean by this; according to the s390x ABI specification f128 is always passed and returned indirectly.

@uweigand
Copy link
Member

Done. f128 is now stored in a single vector register.

Thanks, I'll have a closer look shortly.

I'm unsure what you mean by this; according to the s390x ABI specification f128 is always passed and returned indirectly.

Sorry, I must have gotten confused. You're correct, of course.

@uweigand
Copy link
Member

OK, this all LGTM now. Thanks again!

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, @uweigand! Giving this an approval based on that, and merging.

@cfallin cfallin added this pull request to the merge queue Apr 29, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 29, 2025
@cfallin
Copy link
Member

cfallin commented Apr 29, 2025

@alexcrichton it looks like the merge queue checks failed on a diff of a C++ header -- IIRC you had worked on this recently?

@alexcrichton alexcrichton added this pull request to the merge queue Apr 29, 2025
@alexcrichton
Copy link
Member

I think that was a failure to download wasm.hh and it manifested as a weird error, so I'm going to re-queue, but I could very well be wrong...

Merged via the queue into bytecodealliance:main with commit e52ddbd Apr 29, 2025
71 checks passed
@beetrees beetrees deleted the f16-f128-s390x-mvp branch April 29, 2025 19:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants