From ad615db91b6333d45e6d710019e5b14087af2774 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Sat, 16 Sep 2023 17:13:48 +0200
Subject: [PATCH 1/8] specify NaN behavior more precisely

---
 llvm/docs/LangRef.rst | 36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index f542e70bcfee8..021e7ba2fb417 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3394,17 +3394,41 @@ Floating-Point Environment
 The default LLVM floating-point environment assumes that traps are disabled and
 status flags are not observable. Therefore, floating-point math operations do
 not have side effects and may be speculated freely. Results assume the
-round-to-nearest rounding mode.
+round-to-nearest rounding mode, and subnormals are assumed to be preserved.
+Running default LLVM code in an environment where these assumptions are not met
+can lead to undefined behavior.
+
+The representation bits of a floating-point value do not mutate arbitrarily; if
+there is no floating-point operation being performed, the NaN payload (if any)
+is preserved.
+
+When a floating-point math operation produces a NaN value, the result has a
+non-deterministic sign. The payload is non-deterministically chosen from the
+following set:
+
+- The payload that is all-zero except that the ``quiet`` bit is set.
+  ("Preferred NaN" case)
+- The payload of any input operand that is a NaN, bit-wise ORed with a payload that has
+  the ``quiet`` bit set. ("Quieting NaN propagation" case)
+- The payload of any input operand that is a NaN. ("Unchanged NaN propagation" case)
+- A target-specific set of further NaN payloads, that definitely all have their
+  ``quiet`` bit set. The set can depend on the payloads of the input NaNs.
+  This set is empty on x86 and ARM, but can be non-empty on other architectures.
+  (For instance, on wasm, if any input NaN is not the preferred NaN, then
+  this set contains all quiet NaNs; otherwise, it is empty.
+  On SPARC, this set consists of the all-one payload.)
+
+In particular, if all input NaNs are quiet, then the output NaN is definitely
+quiet. Signaling NaN outputs can only occur if they are provided as an input
+value. For example, "fmul SNaN, 1.0" may be simplified to SNaN rather than QNaN.
 
 Floating-point math operations are allowed to treat all NaNs as if they were
-quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0. This also
-means that SNaN may be passed through a math operation without quieting. For
-example, "fmul SNaN, 1.0" may be simplified to SNaN rather than QNaN. However,
-SNaN values are never created by math operations. They may only occur when
-provided as a program input value.
+quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0.
 
 Code that requires different behavior than this should use the
 :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
+In particular, constrained intrinsics rule out the "Unchanged NaN propagation" case;
+they are guaranteed to return a QNaN.
 
 .. _fastmath:
 

From 91f5076c12ac5d44e807130175227f6efb301e70 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Tue, 19 Sep 2023 08:50:41 +0200
Subject: [PATCH 2/8] clarify which part is the payload and which operations
 are not affected; add list of known-broken cases

---
 llvm/docs/LangRef.rst | 71 +++++++++++++++++++++++++++++++------------
 1 file changed, 52 insertions(+), 19 deletions(-)
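
Reviewer note (illustration only, not part of the LangRef change): the
sign / quiet bit / payload split described in this patch can be sketched for
an IEEE-754 binary32 bit pattern as below. The helper name ``decode_nan`` and
the mask constants are hypothetical names chosen for this sketch; it works on
raw 32-bit integers so that no host floating-point conversion can alter the
NaN.

```python
# Field masks for an IEEE-754 binary32 value (assumed layout for this sketch).
EXP_MASK = 0x7F800000      # 8 exponent bits, all ones for NaN and infinity
MANT_MASK = 0x007FFFFF     # 23 mantissa bits
QUIET_BIT = 0x00400000     # top mantissa bit: 1 = QNaN, 0 = SNaN
PAYLOAD_MASK = 0x003FFFFF  # rest of the mantissa


def decode_nan(bits):
    """Split a binary32 NaN bit pattern into (sign, is_quiet, payload)."""
    if (bits & EXP_MASK) != EXP_MASK or (bits & MANT_MASK) == 0:
        raise ValueError("bit pattern is not a NaN")
    sign = (bits >> 31) & 1
    is_quiet = bool(bits & QUIET_BIT)
    payload = bits & PAYLOAD_MASK
    return sign, is_quiet, payload
```

With this layout, ``0x7FC00000`` decodes as a positive quiet NaN with the
all-zero ("preferred") payload. A signaling NaN needs a non-zero payload,
since an all-zero mantissa would encode infinity.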

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 021e7ba2fb417..57812aa4e7b26 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3398,25 +3398,44 @@ round-to-nearest rounding mode, and subnormals are assumed to be preserved.
 Running default LLVM code in an environment where these assumptions are not met
 can lead to undefined behavior.
 
-The representation bits of a floating-point value do not mutate arbitrarily; if
-there is no floating-point operation being performed, the NaN payload (if any)
-is preserved.
+Code that requires different behavior than this should use the
+:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
+
+.. _floatnan:
+
+Behavior of Floating-Point NaN values
+-------------------------------------
+
+A floating-point NaN value consists of a sign bit, a quiet/signaling bit, and a
+payload (which makes up the rest of the mantissa except for the quiet/signaling
+bit). LLVM assumes that the quiet/signaling bit being set to ``1`` indicates a
+quiet NaN (QNan), and a value of ``0`` indicates a signaling NaN (SNaN). In the
+following we will hence just call it the "quiet bit".
+
+The representation bits of a floating-point value do not mutate arbitrarily; in
+particular, if there is no floating-point operation being performed, NaN signs,
+quiet bits, and payloads are preserved.
+
+For the purpose of this section, ``bitcast`` as well as the following operations
+are not "floating-point math operations": ``fneg``, ``llvm.fabs``, and
+``llvm.copysign``. They act directly on the underlying bit representation and
+never change anything except for the sign bit.
 
 When a floating-point math operation produces a NaN value, the result has a
-non-deterministic sign. The payload is non-deterministically chosen from the
-following set:
-
-- The payload that is all-zero except that the ``quiet`` bit is set.
-  ("Preferred NaN" case)
-- The payload of any input operand that is a NaN, bit-wise ORed with a payload that has
-  the ``quiet`` bit set. ("Quieting NaN propagation" case)
-- The payload of any input operand that is a NaN. ("Unchanged NaN propagation" case)
-- A target-specific set of further NaN payloads, that definitely all have their
-  ``quiet`` bit set. The set can depend on the payloads of the input NaNs.
-  This set is empty on x86 and ARM, but can be non-empty on other architectures.
-  (For instance, on wasm, if any input NaN is not the preferred NaN, then
-  this set contains all quiet NaNs; otherwise, it is empty.
-  On SPARC, this set consists of the all-one payload.)
+non-deterministic sign. The quiet bit and payload are non-deterministically
+chosen from the following set of options:
+
+- The quiet bit is set and the payload is all-zero. ("Preferred NaN" case)
+- The quiet bit is set and the payload is copied from any input operand that is
+  a NaN. ("Quieting NaN propagation" case)
+- The quiet bit and payload are copied from any input operand that is a NaN.
+  ("Unchanged NaN propagation" case)
+- The quiet bit is set and the payload is picked from a target-specific set of
+  further possible NaN payloads. The set can depend on the payloads of the input
+  NaNs. This set is empty on x86 and ARM, but can be non-empty on other
+  architectures. (For instance, on wasm, if any input NaN does not have the
+  preferred all-zero payload, then this set contains all possible payloads;
+  otherwise, it is empty. On SPARC, this set consists of the all-one payload.)
 
 In particular, if all input NaNs are quiet, then the output NaN is definitely
 quiet. Signaling NaN outputs can only occur if they are provided as an input
@@ -3427,8 +3446,22 @@ quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0.
 
 Code that requires different behavior than this should use the
 :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
-In particular, constrained intrinsics rule out the "Unchanged NaN propagation" case;
-they are guaranteed to return a QNaN.
+In particular, constrained intrinsics rule out the "Unchanged NaN propagation"
+case; they are guaranteed to return a QNaN.
+
+Unfortunately, due to hard-or-impossible-to-fix issues, LLVM violates its own
+specification on some architectures:
+- x86-32 without SSE2 enabled may convert floating-point values to x86_fp80 and
+  back when performing floating-point math operations; this can lead to results
+  with different precision than expected and it can alter NaN values. Since
+  optimizations can make contradiction assumptions, this can lead to arbitrary
+  miscompilations. See `issue #44218
+  <https://github.com/llvm/llvm-project/issues/44218>`_.
+- x86-32 (even with SSE2 enabled) may implicitly perform such a conversion on
+  values returned from a function.
+- Older MIPS versions use the opposite polarity for the quiet/signaling bit, and
+  LLVM does not correctly represent this. See `issue #60796
+  <https://github.com/llvm/llvm-project/issues/60796>`_.
 
 .. _fastmath:
 

From 7029109c70919de4f3bc57f94cae9b4023fe0695 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Tue, 19 Sep 2023 08:55:38 +0200
Subject: [PATCH 3/8] mention strictfp and denormal-fp-math

---
 llvm/docs/LangRef.rst | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 57812aa4e7b26..2a82333a417b8 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3395,11 +3395,12 @@ The default LLVM floating-point environment assumes that traps are disabled and
 status flags are not observable. Therefore, floating-point math operations do
 not have side effects and may be speculated freely. Results assume the
 round-to-nearest rounding mode, and subnormals are assumed to be preserved.
-Running default LLVM code in an environment where these assumptions are not met
-can lead to undefined behavior.
 
-Code that requires different behavior than this should use the
-:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
+Running LLVM code in an environment where these assumptions are not met can lead
+to undefined behavior. The ``strictfp`` and ``denormal-fp-math`` attributes as
+well as :ref:`Constrained Floating-Point Intrinsics <constrainedfp>` can be used
+to weaken LLVM's assumptions and ensure defined behavior in non-default
+floating-point environments; see their respective documentation for details.
 
 .. _floatnan:
 

From e27e3ea7b01afcd2e8d05055be9187cf99b636af Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Tue, 19 Sep 2023 17:08:12 +0200
Subject: [PATCH 4/8] fix nits

---
 llvm/docs/LangRef.rst | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 2a82333a417b8..ef099cd14835a 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3410,7 +3410,7 @@ Behavior of Floating-Point NaN values
 A floating-point NaN value consists of a sign bit, a quiet/signaling bit, and a
 payload (which makes up the rest of the mantissa except for the quiet/signaling
 bit). LLVM assumes that the quiet/signaling bit being set to ``1`` indicates a
-quiet NaN (QNan), and a value of ``0`` indicates a signaling NaN (SNaN). In the
+quiet NaN (QNaN), and a value of ``0`` indicates a signaling NaN (SNaN). In the
 following we will hence just call it the "quiet bit".
 
 The representation bits of a floating-point value do not mutate arbitrarily; in
@@ -3419,12 +3419,13 @@ quiet bits, and payloads are preserved.
 
 For the purpose of this section, ``bitcast`` as well as the following operations
 are not "floating-point math operations": ``fneg``, ``llvm.fabs``, and
-``llvm.copysign``. They act directly on the underlying bit representation and
-never change anything except for the sign bit.
+``llvm.copysign``. These operations act directly on the underlying bit
+representation and never change anything except possibly for the sign bit.
 
-When a floating-point math operation produces a NaN value, the result has a
-non-deterministic sign. The quiet bit and payload are non-deterministically
-chosen from the following set of options:
+For floating-point math operations, unless specified otherwise, the following
+rules apply when a NaN value is returned: the result has a non-deterministic
+sign; the quiet bit and payload are non-deterministically chosen from the
+following set of options:
 
 - The quiet bit is set and the payload is all-zero. ("Preferred NaN" case)
 - The quiet bit is set and the payload is copied from any input operand that is
@@ -3455,11 +3456,11 @@ specification on some architectures:
 - x86-32 without SSE2 enabled may convert floating-point values to x86_fp80 and
   back when performing floating-point math operations; this can lead to results
   with different precision than expected and it can alter NaN values. Since
-  optimizations can make contradiction assumptions, this can lead to arbitrary
+  optimizations can make contradicting assumptions, this can lead to arbitrary
   miscompilations. See `issue #44218
   <https://github.com/llvm/llvm-project/issues/44218>`_.
 - x86-32 (even with SSE2 enabled) may implicitly perform such a conversion on
-  values returned from a function.
+  values returned from a function for some calling conventions.
 - Older MIPS versions use the opposite polarity for the quiet/signaling bit, and
   LLVM does not correctly represent this. See `issue #60796
   <https://github.com/llvm/llvm-project/issues/60796>`_.

From 62e022bd505ef8fe8d8a38aa4012859adeed062c Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Tue, 19 Sep 2023 18:24:33 +0200
Subject: [PATCH 5/8] the target 'extra' NaNs can depend on the inputs in
 general; clarify what this achieves

---
 llvm/docs/LangRef.rst | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)
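
Reviewer note (illustration only, not part of the LangRef change): one way to
read the four cases is as a set of permitted result bit patterns. The sketch
below enumerates that set for binary32, assuming the operation does return a
NaN and has no operation-specific exceptions. ``allowed_nan_results`` is a
hypothetical helper, and the target's "extra" payload set is modeled as a
caller-supplied argument because it is target-specific and may depend on the
operands.

```python
EXP_MASK = 0x7F800000      # exponent field, all ones for NaN
MANT_MASK = 0x007FFFFF     # full mantissa (quiet bit + payload)
QUIET_BIT = 0x00400000     # 1 = quiet, 0 = signaling
PAYLOAD_MASK = 0x003FFFFF  # mantissa without the quiet bit
SIGN_BIT = 0x80000000


def is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def allowed_nan_results(operands, extra_payloads=()):
    """Return every binary32 NaN bit pattern the rules permit as a result."""
    mantissas = {QUIET_BIT}                  # "Preferred NaN" case
    for bits in operands:
        if not is_nan(bits):
            continue
        mant = bits & MANT_MASK
        mantissas.add(mant | QUIET_BIT)      # "Quieting NaN propagation"
        mantissas.add(mant)                  # "Unchanged NaN propagation"
    for payload in extra_payloads:           # target-specific "extra" payloads
        mantissas.add(QUIET_BIT | (payload & PAYLOAD_MASK))
    # The sign is non-deterministic, so both signs are permitted.
    return {sign | EXP_MASK | m for m in mantissas for sign in (0, SIGN_BIT)}
```

For operands without any NaN, and with no extra payloads, the set contains
only the two preferred NaNs (one per sign), matching the guarantee added in
this patch. Passing ``extra_payloads=[PAYLOAD_MASK]`` models the all-one
payload mentioned for SPARC.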

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index ef099cd14835a..55386b5303d51 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3433,15 +3433,19 @@ following set of options:
 - The quiet bit and payload are copied from any input operand that is a NaN.
   ("Unchanged NaN propagation" case)
 - The quiet bit is set and the payload is picked from a target-specific set of
-  further possible NaN payloads. The set can depend on the payloads of the input
-  NaNs. This set is empty on x86 and ARM, but can be non-empty on other
-  architectures. (For instance, on wasm, if any input NaN does not have the
-  preferred all-zero payload, then this set contains all possible payloads;
-  otherwise, it is empty. On SPARC, this set consists of the all-one payload.)
-
-In particular, if all input NaNs are quiet, then the output NaN is definitely
-quiet. Signaling NaN outputs can only occur if they are provided as an input
-value. For example, "fmul SNaN, 1.0" may be simplified to SNaN rather than QNaN.
+  "extra" possible NaN payloads. The set can depend on the input operand values.
+  This set is empty on x86 and ARM, but can be non-empty on other architectures.
+  (For instance, on wasm, if any input NaN does not have the preferred all-zero
+  payload or any input NaN is an SNaN, then this set contains all possible
+  payloads; otherwise, it is empty. On SPARC, this set consists of the all-one
+  payload.)
+
+In particular, if all input NaNs are quiet (or if there are no input NaNs), then
+the output NaN is definitely quiet. Signaling NaN outputs can only occur if they
+are provided as an input value. For example, "fmul SNaN, 1.0" may be simplified
+to SNaN rather than QNaN. Similarly, if all input NaNs are preferred (or if
+there are no input NaNs) and the target does not have any "extra" NaN payloads,
+then the output NaN is guaranteed to be preferred.
 
 Floating-point math operations are allowed to treat all NaNs as if they were
 quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0.

From 83f182abbe1118ce4e9da13dbc117e9015c67fa9 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 21 Sep 2023 09:48:46 +0200
Subject: [PATCH 6/8] link to issue for x86-32 with SSE calling convention
 problem

---
 llvm/docs/LangRef.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 55386b5303d51..2200757201c9b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3464,7 +3464,8 @@ specification on some architectures:
   miscompilations. See `issue #44218
   <https://github.com/llvm/llvm-project/issues/44218>`_.
 - x86-32 (even with SSE2 enabled) may implicitly perform such a conversion on
-  values returned from a function for some calling conventions.
+  values returned from a function for some calling conventions. See `issue
+  #66803 <https://github.com/llvm/llvm-project/issues/66803>`_.
 - Older MIPS versions use the opposite polarity for the quiet/signaling bit, and
   LLVM does not correctly represent this. See `issue #60796
   <https://github.com/llvm/llvm-project/issues/60796>`_.

From 670b08c4ba609b671676d249331afd2216defc35 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 21 Sep 2023 09:49:23 +0200
Subject: [PATCH 7/8] fneg, fabs, copysign: explicitly state that NaN quietness
 and payload are preserved

---
 llvm/docs/LangRef.rst | 9 +++++++++
 1 file changed, 9 insertions(+)
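
Reviewer note (illustration only, not part of the LangRef change): the
guarantee added here amounts to saying that these three operations behave
like pure sign-bit manipulations on the representation. A minimal sketch on
binary32 bit patterns, with hypothetical helper names:

```python
SIGN_BIT = 0x80000000
NON_SIGN_MASK = 0x7FFFFFFF  # exponent and mantissa, including quiet bit and payload


def fneg_bits(bits):
    """Flip the sign bit and leave every other bit untouched."""
    return bits ^ SIGN_BIT


def fabs_bits(bits):
    """Clear the sign bit and leave every other bit untouched."""
    return bits & NON_SIGN_MASK


def copysign_bits(magnitude, sign_source):
    """Take all non-sign bits from the first operand, the sign from the second."""
    return (magnitude & NON_SIGN_MASK) | (sign_source & SIGN_BIT)
```

Under this model a signaling NaN such as ``0x7F800001`` stays signaling and
keeps its payload through all three operations.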

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 2200757201c9b..3a3f5fa090675 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -9149,6 +9149,9 @@ Semantics:
 """"""""""
 
 The value produced is a copy of the operand with its sign bit flipped.
+The value is otherwise completely identical; in particular, if the input is a
+NaN, then the quiet/signaling bit and payload are perfectly preserved.
+
 This instruction can also take any number of :ref:`fast-math
 flags <fastmath>`, which are optimization hints to enable otherwise
 unsafe floating-point optimizations:
@@ -15156,6 +15159,9 @@ Semantics:
 
 This function returns the same values as the libm ``fabs`` functions
 would, and handles error conditions in the same way.
+The returned value is completely identical to the input except for the sign bit;
+in particular, if the input is a NaN, then the quiet/signaling bit and payload
+are perfectly preserved.
 
 .. _i_minnum:
 
@@ -15371,6 +15377,9 @@ Semantics:
 
 This function returns the same values as the libm ``copysign``
 functions would, and handles error conditions in the same way.
+The returned value is completely identical to the first operand except for the
+sign bit; in particular, if the first operand is a NaN, then the quiet/signaling
+bit and payload are perfectly preserved.
 
 .. _int_floor:
 

From c14faf31b0d63a437aaaf7c421c71c14ef90d7d4 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 21 Sep 2023 17:38:25 +0200
Subject: [PATCH 8/8] fptrunc and fpext

---
 llvm/docs/LangRef.rst | 10 ++++++++++
 1 file changed, 10 insertions(+)
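
Reviewer note (illustration only, not part of the LangRef change): for the
binary32/binary64 pair, the payload rule in the propagation cases can be
sketched as a shift by the difference in payload width (51 bits versus 22
bits). The helper names are hypothetical, and the sketch covers only the
payload mapping, not the choice of sign or quiet bit.

```python
F32_PAYLOAD_BITS = 22  # 23 mantissa bits minus the quiet bit
F64_PAYLOAD_BITS = 51  # 52 mantissa bits minus the quiet bit
SHIFT = F64_PAYLOAD_BITS - F32_PAYLOAD_BITS  # 29 low-order bits exist only in f64

F32_PAYLOAD_MASK = (1 << F32_PAYLOAD_BITS) - 1
F64_PAYLOAD_MASK = (1 << F64_PAYLOAD_BITS) - 1


def fpext_payload(payload32):
    """fpext: copy the payload into the high-order bits, zero the low-order bits."""
    return (payload32 & F32_PAYLOAD_MASK) << SHIFT


def fptrunc_payload(payload64):
    """fptrunc: discard the low-order bits that do not fit in the result."""
    return (payload64 & F64_PAYLOAD_MASK) >> SHIFT
```

Extending and then truncating a payload round-trips, while a binary64 payload
that only has low-order bits set truncates to the all-zero payload.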

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 3a3f5fa090675..89a4a44b7d369 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -11307,6 +11307,11 @@ The '``fptrunc``' instruction casts a ``value`` from a larger
 This instruction is assumed to execute in the default :ref:`floating-point
 environment <floatenv>`.
 
+NaN values follow the usual :ref:`NaN behaviors <floatnan>`, except that *if* a
+NaN payload is propagated from the input ("Quieting NaN propagation" or
+"Unchanged NaN propagation" cases), then the low order bits of the NaN payload
+which cannot fit in the resulting type are discarded.
+
 Example:
 """"""""
 
@@ -11347,6 +11352,11 @@ The '``fpext``' instruction extends the ``value`` from a smaller
 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
 *no-op cast* for a floating-point cast.
 
+NaN values follow the usual :ref:`NaN behaviors <floatnan>`, except that *if* a
+NaN payload is propagated from the input ("Quieting NaN propagation" or
+"Unchanged NaN propagation" cases), then it is copied to the high order bits of
+the resulting payload, and the remaining low order bits are zero.
+
 Example:
 """"""""