Floating point intrinsics are not IPO :consistent #49353
Could we make them […]?
Can someone elaborate more on the true harm of inaccurately marking these `:consistent`? For example, I'm imagining the issue would arise in situations like

```julia
const customNaN = reinterpret(Float64, -1) # !== NaN
f(x) = customNaN === customNaN + x
```

The example function […] (I will ignore any surprise changes to sign bits in this next paragraph, but the hardware could presumably mess with those too). Is this an example where the […]?
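For context, a minimal REPL sketch of the bit patterns involved (the behavior in the last line is target- and optimizer-dependent, which is exactly the problem under discussion):

```julia
customNaN = reinterpret(Float64, -1)   # all 64 bits set: a non-canonical NaN

reinterpret(UInt64, customNaN)   # 0xffffffffffffffff
reinterpret(UInt64, NaN)         # 0x7ff8000000000000 (the canonical quiet NaN)
customNaN === NaN                # false: === on Float64 compares raw bits

# Whether this holds depends on how the hardware and LLVM propagate NaN
# payloads through the addition, which is the underdefined part:
customNaN + 0.0 === customNaN    # may be true or false depending on target
```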
Could we fix this by redocumenting `:consistent`? For example (hopefully nobody ever does this), `Base.@assume_effects :consistent f() = rand()` would be valid and […].
No, there could be interior NaNs that make the return value non-deterministic.
Yes pretty much.
No, we can't do this for IPO `:consistent`, but this is pretty close to the non-IPO consistent (option 1).
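Regarding the interior-NaN point above, a hedged sketch (the function name `interior` is illustrative):

```julia
# No NaN appears in the input, yet an interior operation produces one whose
# payload propagation is underdefined; observing its bits makes the result
# non-deterministic as far as the optimizer is concerned.
function interior(x::Float64)
    t = (Inf - Inf) + x            # Inf - Inf is NaN; the sum's payload is underdefined
    return reinterpret(UInt64, t)  # observes the NaN's bit pattern
end
```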
As far as I understand, the main problem is when we try to do concrete eval in inference. Roughly speaking, inference asks the question "What are all the possible outputs of this function?". The use case for concrete eval is that if the inputs are all compile-time-known and the function is fully deterministic, then this question can be answered quite easily: there is only one possible output, and it's the one you get when you just run the function.

The problem for IPO `:consistent`-cy here is that the LLVM optimizer can end up effectively adding non-determinacy to the function if that function observes, e.g., which NaN bit pattern the output of a floating-point operation carries.

In that case, we told inference that a function would only evaluate to a certain result, but when the program runs, it turns out we were wrong and it evaluates to something else. If inference made important assumptions based on that (e.g. pruning dead code branches), we hit UB. (@Keno please correct me if I got any details wrong)
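A hedged sketch of that failure mode (`g` and `h` are illustrative names; real concrete eval happens inside the compiler, not in user code):

```julia
# Observes a NaN bit pattern, so its result depends on LLVM's underdefined
# NaN propagation; suppose it were nonetheless believed IPO :consistent.
g() = (0.0 / 0.0) === reinterpret(Float64, -1)

function h()
    # If inference concrete-evaluates g() to a constant (say, false), it may
    # delete the then-branch. If the optimized runtime code then produces a
    # different NaN bit pattern and g() flips to true, execution reaches code
    # that inference proved dead: undefined behavior.
    return g() ? 1 : 2
end
```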
That is correct. For posterity, discussion from slack this morning: https://gist.github.com/Keno/ec6e87cd0abffc7d776e1bdf38e0a74c |
Right now we consider all non-fastmath floating point intrinsics (other than `muladd`, which LLVM may implement as either an fma or a separate multiply and add) as `:consistent`. However, as discussed in https://discourse.llvm.org/t/semantics-of-nan/66729/1, LLVM has underdefined semantics for NaN propagation. In particular, if the inputs to these intrinsics are NaN, LLVM is allowed to return any NaN input or a canonicalized NaN, but makes no semantic guarantee about which. As a result, floating point arithmetic implemented using LLVM semantics is not `:consistent` (even if the underlying hardware would be), and we cannot mark it as such.
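To make the non-determinism concrete, a small sketch (which of the commented results you observe depends on target and optimization level; all are permitted by LLVM):

```julia
a = reinterpret(Float64, -1)   # NaN with payload bits 0xffffffffffffffff
b = NaN                        # canonical quiet NaN, 0x7ff8000000000000

# LLVM may return either input NaN or a canonicalized NaN here, so either
# 0xffffffffffffffff or 0x7ff8000000000000 (or another canonical encoding)
# is an allowed bit pattern for the result:
reinterpret(UInt64, a + b)
```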
Unfortunately, this would likely lose us a significant fraction of the constant-propagation power that the effect system has brought us. There are a couple of things we can try to recover this:

1. We have a provision in the effect system for non-IPO effects. These were added with the intention of being used for fastmath floating point intrinsics, but they apply here too. Currently they are unused, but we should be able to use them to do constant propagation in the optimizer.
2. We could consider doing some sort of precondition-inference to be able to determine that these functions are in fact consistent as long as the input is not `NaN` (which we can check before constant propagation). I don't quite know how to represent this, but it seems doable.
3. We could consider explicitly normalizing all `NaN`s after every arithmetic operation (a sketch follows this list). Some CPUs (e.g. RISC-V) have these semantics anyway, so for them it would be free. For others it would potentially introduce additional instructions, but I would expect canonicalization to be reasonably optimizable, since you can generally push it to just before any memory store or other escape. Nevertheless, the potential performance implications are a bit scary.
4. We could consider changing the definition of `===` on floating point numbers to implicitly canonicalize all `NaN`s. This would let us avoid doing the canonicalization explicitly, but it would break the invariant that `isbits` types are compared using exact memory comparison, which I think is too much of a trade-off to be feasible, though I wanted to list it as an option.