Fix UB in 128 bit cttz/ctlz intrinsics #672

Open
wants to merge 1 commit into
base: master

FractalFir
Contributor

Fixes #604.

This PR makes three changes.

First, it enables UB checks for tests in debug mode, which should let us catch any regressions.

Second, I extracted the common implementation of the "safe" ctlz/cttz intrinsics into a separate function.

This makes handling the 128 bit edge case a bit easier, and lets us reuse those intrinsics in the 128 bit emulation.

Third, the 128 bit ctlz/cttz emulation now uses the "safe" version of the 64 bit intrinsic instead of calling the GCC built-in directly.

Effectively, this turns a piece of IR like this:

// Get the ctlz for the low and high half of the 128 bit integer
let ctlz_low = ctlz_nonzero(low);
let ctlz_high = ctlz_nonzero(high);

into a piece of code like this:

// Get the ctlz for the low and high half of the 128 bit integer
let ctlz_low = if low != 0 { ctlz_nonzero(low) } else { 64 };
let ctlz_high = if high != 0 { ctlz_nonzero(high) } else { 64 };

This solution is not the prettiest (maybe fewer branches could be used?), but it is UB-free and relatively easy to implement.
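
For illustration only, here is a minimal plain-Rust sketch of the guarded pattern described above. It is not the code in this PR, which emits GCC IR rather than Rust. The names `ctlz_nonzero`, `cttz_nonzero`, `ctlz_safe`, `cttz_safe`, `ctlz_u128` and `cttz_u128` are hypothetical, and the `*_nonzero` helpers only stand in for a built-in whose result is undefined for a zero input (here modelled with an assertion).

```rust
/// Hypothetical stand-in for a built-in that is undefined for 0.
/// The assertion models the precondition; the real built-in would not check it.
fn ctlz_nonzero(x: u64) -> u32 {
    assert!(x != 0, "ctlz_nonzero called with 0");
    x.leading_zeros()
}

/// Same idea for trailing zeros.
fn cttz_nonzero(x: u64) -> u32 {
    assert!(x != 0, "cttz_nonzero called with 0");
    x.trailing_zeros()
}

/// "Safe" 64 bit variants: guard the zero case before calling the built-in.
fn ctlz_safe(x: u64) -> u32 {
    if x != 0 { ctlz_nonzero(x) } else { 64 }
}

fn cttz_safe(x: u64) -> u32 {
    if x != 0 { cttz_nonzero(x) } else { 64 }
}

/// 128 bit ctlz built from the two 64 bit halves.
fn ctlz_u128(x: u128) -> u32 {
    let low = x as u64;
    let high = (x >> 64) as u64;
    // Both halves are computed up front, so both must tolerate zero.
    let ctlz_low = ctlz_safe(low);
    let ctlz_high = ctlz_safe(high);
    if high != 0 { ctlz_high } else { 64 + ctlz_low }
}

/// 128 bit cttz built from the two 64 bit halves.
fn cttz_u128(x: u128) -> u32 {
    let low = x as u64;
    let high = (x >> 64) as u64;
    let cttz_low = cttz_safe(low);
    let cttz_high = cttz_safe(high);
    if low != 0 { cttz_low } else { 64 + cttz_high }
}
```

Because both halves are evaluated before the result is selected, an unguarded call would hit the zero case whenever either half is zero, which is why the guard sits inside the 64 bit helper rather than around the final selection.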

@antoyo
Contributor

antoyo commented May 11, 2025

Thanks for sending this PR.

How does the generated code compare to the fix made in this PR?

Successfully merging this pull request may close these issues.

UB in the implementation of the 128 bit ctlz intrinsic