Skip to content

Latest commit

 

History

History
53 lines (27 loc) · 9.34 KB

iii.1.1.1-numeric-data-types.md

File metadata and controls

53 lines (27 loc) · 9.34 KB

III.1.1.1 Numeric data types

  • The CLI only operates on the numeric types int32 (4-byte signed integers), int64 (8-byte signed integers), native int (native-size integers), and F (native-size floating-point numbers). However, the CIL instruction set allows additional data types to be implemented:

  • Short integers: The evaluation stack only holds 4- or 8-byte integers, but other locations (arguments, local variables, statics, array elements, fields) can hold 1- or 2-byte integers. For the purpose of stack operations the bool (§III.1.1.2) and char types are treated as unsigned 1-byte and 2-byte integers respectively. Loading from these locations onto the stack converts them to 4-byte values by:

    • zero-extending for types unsigned int8, unsigned int16, bool and char;

    • sign-extending for types int8 and int16;

    • zero-extends for unsigned indirect and element loads (ldind.u*, ldelem.u*, etc.); and

    • sign-extends for signed indirect and element loads (ldind.i*, ldelem.i*, etc.)

    Storing to integers, booleans, and characters (stloc, stfld, stind.i1, stelem.i2, etc.) truncates. Use the conv.ovf.* instructions to detect when this truncation results in a value that doesn't correctly represent the original value.

    [Note: Short (i.e., 1- and 2-byte) integers are loaded as 4-byte numbers on all architectures and these 4-byte numbers are always tracked as distinct from 8-byte numbers. This helps portability of code by ensuring that the default arithmetic behavior (i.e., when no conv or conv.ovf instruction is executed) will have identical results on all implementations. end note]

    Convert instructions that yield short integer values actually leave an int32 (32-bit) value on the stack, but it is guaranteed that only the low bits have meaning (i.e., the more significant bits are all zero for the unsigned conversions or a sign extension for the signed conversions). To correctly simulate the full set of short integer operations a conversion to a short integer is required before the div, rem, shr, comparison and conditional branch instructions.

    In addition to the explicit conversion instructions there are four cases where the CLI handles short integers in a special way:

    1. Assignment to a local (stloc) or argument (starg) whose type is declared to be a short integer type automatically truncates to the size specified for the local or argument.

    2. Loading from a local (ldloc) or argument (ldarg) whose type is declared to be a short signed integer type automatically sign extends

    3. Calling a procedure with an argument that is a short integer type is equivalent to assignment to the argument value, so it truncates.

    4. Returning a value from a method whose return type is a short integer is modeled as storing into a short integer within the called procedure (i.e., the CLI automatically truncates) and then loading from a short integer within the calling procedure (i.e., the CLI automatically zero- or sign-extends).

    In the last two cases it is up to the native calling convention to determine whether values are actually truncated or extended, as well as whether this is done in the called procedure or the calling procedure. The CIL instruction sequence is unaffected and it is as though the CIL sequence included an appropriate conv instruction.

  • 4-byte integers: The shortest value actually stored on the stack is a 4-byte integer. These can be converted to 8-byte integers or native-size integers using conv.* instructions. Native-size integers can be converted to 4-byte integers, but doing so is not portable across architectures. The conv.i4 and conv.u4 can be used for this conversion if the excess significant bits should be ignored; the conv.ovf.i4 and conv.ovf.u4 instructions can be used to detect the loss of information. Arithmetic operations allow 4-byte integers to be combined with native size integers, resulting in native size integers. 4-byte integers cannot be directly combined with 8-byte integers (they shall be converted to 8-byte integers first).

  • Native-size integers: Native-size integers can be combined with 4-byte integers using any of the normal arithmetic instructions, and the result will be a native-size integer. Native-size integers shall be explicitly converted to 8-byte integers before they can be combined with 8-byte integers.

  • 8-byte integers: Supporting 8-byte integers on 32-bit hardware can be expensive, whereas 32-bit arithmetic is available and efficient on current 64-bit hardware. For this reason, numeric instructions allow int32 and I data types to be intermixed (yielding the largest type used as input), but these types cannot be combined with int64s. Instead, a native int or int32 shall be explicitly converted to int64 before it can be combined with an int64.

  • Unsigned integers: Special instructions are used to interpret integers on the stack as though they were unsigned, rather than tagging the stack locations as being unsigned.

  • Floating-point numbers: See also Partition I, Handling of Floating Point Datatypes. Storage locations for floating-point numbers (statics, array elements, and fields of classes) are of fixed size. The supported storage sizes are float32 and float64. Everywhere else (on the evaluation stack, as arguments, as return types, and as local variables) floating-point numbers are represented using an internal floating-point type. In each such instance, the nominal type of the variable or expression is either float32 or float64, but its value might be represented internally with additional range and/or precision. The size of the internal floating-point representation is implementation-dependent, might vary, and shall have precision at least as great as that of the variable or expression being represented. An implicit widening conversion to the internal representation from float32 or float64 is performed when those types are loaded from storage. The internal representation is typically the natural size for the hardware, or as required for efficient implementation of an operation. The internal representation shall have the following characteristics:

    • The internal representation shall have precision and range greater than or equal to the nominal type.

    • Conversions to and from the internal representation shall preserve value. [Note: This implies that an implicit widening conversion from float32 (or float64) to the internal representation, followed by an explicit conversion from the internal representation to float32 (or float64), will result in a value that is identical to the original float32 (or float64) value.]

    [Note: The above specification allows a compliant implementation to avoid rounding to the precision of the target type on intermediate computations, and thus permits the use of wider precision hardware registers, as well as the application of optimizing transformations (such as contractions), which result in the same or greater precision. Where exactly reproducible behavior precision is required by a language or application (e.g., the Kahan Summation Formula), explicit conversions can be used. Reproducible precision does not guarantee reproducible behavior, however. Implementations with extra precision might round twice: once for the floating-point operation, and once for the explicit conversion. Implementations without extra precision effectively round only once. In rare cases, rounding twice versus rounding once can yield results differing by one unit of least precision. end note]

    When a floating-point value whose internal representation has greater range and/or precision than its nominal type is put in a storage location, it is automatically coerced to the type of the storage location. This might involve a loss of precision or the creation of an out-of-range value (NaN, +infinity, or -infinity). However, the value might be retained in the internal representation for future use, if it is reloaded from the storage location without having been modified. It is the responsibility of the compiler to ensure that the memory location is still valid at the time of a subsequent load, taking into account the effects of aliasing and other execution threads (see memory model section). This freedom to carry extra precision is not permitted, however, following the execution of an explicit conversion (conv.r4 or conv.r8), at which time the internal representation shall be exactly representable in the associated type.

    [Note: To detect values that cannot be converted to a particular storage type, use a conversion instruction (conv.r4, or conv.r8) and then check for an out-of-range value using ckfinite. To detect underflow when converting to a particular storage type, a comparison to zero is required before and after the conversion. end note]

    [Note: This standard does not specify the behavior of arithmetic operations on denormalized floating point numbers, nor does it specify when or whether such representations should be created. This is in keeping with IEC 60559:1989. In addition, this standard does not specify how to access the exact bit pattern of NaNs that are created, nor the behavior when converting a NaN between 32-bit and 64-bit representation. All of this behavior is deliberately left implementation-specific. end note]