From 292b0228dd17f63b572831c8b0f57a4fe0344abc Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Fri, 12 Jul 2019 17:34:00 +0200 Subject: [PATCH 1/8] Initial commit: target-feature-runtime RFC --- text/0000-target-feature-runtime.md | 271 ++++++++++++++++++++++++++++ 1 file changed, 271 insertions(+) create mode 100644 text/0000-target-feature-runtime.md diff --git a/text/0000-target-feature-runtime.md b/text/0000-target-feature-runtime.md new file mode 100644 index 00000000000..56e7a934f7f --- /dev/null +++ b/text/0000-target-feature-runtime.md @@ -0,0 +1,271 @@ +- Feature Name: `target_feature_runtime` +- Start Date: 2019-07-12 +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +This RFC allows `#![no_std]` binaries and libraries (e.g. like `libcore` and +`liballoc`) to perform run-time feature detection. + +# Motivation +[motivation]: #motivation + +Binaries and libraries using the `std` library can perform run-time feature +detection via the `is_x86_feature_detected!("avx")` architecture-specific +macros. + +This operation requires, in general, operating system support, and is therefore +not available in `libcore`, which is operating system agnostic. + +That is, `#![no_std]` libraries, like `liballoc` and `libcore`, cannot perform +run-time feature detection, even though `libstd` often ends up being linked into +the final binary. + +This results in some crates in crates.io having much better performance than the +methods of the types provided by `libcore`, like `&str`, `[T]`, `Iterator`, etc. + +One example is the `is_sorted` crate, which provides an implementation of +`Iterator::is_sorted`, which performs 16x better than the `libcore` +implementation by using AVX. Another example include the `memchr` crate, as well +as crates implementing algorithms to compute whether a `[u8]` is an ASCII +string, or an UTF-8 string. 
These perform on the ballpark of about 1.6x better +than the `libcore` implementations, by using AVX on x86. + +For `#![no_std]` binaries, the standard library is not linked into the final +binary, and they cannot use any library that uses the runtime feature detection +macros, because they are not available. + +The goal of this RFC is to enable `#![no_std]` libraries and binaries to perform +run-time feature detection. + +# Constraints on the design + +It helps to first enumerate the "self-imposed" constraints on the design: + +* **zero-cost abstraction**: it shouldn't be possible to do runtime feature + detection better than via the APIs provided here. +* **don't pay for what you don't use**: programs that don't need to do any + runtime feature detection should not pay anything for it, in terms of costs in + binary size, memory usage, time spent on binary initialization, etc. +* **99.9%** The majority of Rust users should be able to benefit from this, + e.g., via `libcore` using it, without having to know that this exists. +* **reliable**: it should be possible to reliably do run-time feature detection, + since it is required to prove that some `unsafe` code is actually safe. +* **portable libraries**: libraries that use run-time feature detection should + be able to do so, without restricting which users can use the library. +* **cross-domain**: operating system kernels and user-space applications often + need to do run-time feature detection in very different ways - all use cases + should be supported. +* **cdylibs**: dynamic libraries should be able to do run-time feature detection. +* **embedded systems**: binaries running on read-only memory (e.g. in a ROM) + should be able to do runtime feature detection. + +These constraints motivate the design. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +Users can continue to perform run-time feature detection by using the +`is_{architecture}_feature_detected!` macros. 
These macros were previously only +available from libstd, and are now available in `libcore`. That is, `#![no_std]` +libraries and binaries can use them. + +Users can now provide their own target-feature detection run-time: + +```rust +#[target_feature_detection_runtime] +static TargetFeatureRT: impl core::detect::Runtime; +``` + +by using the `#[target_feature_detection_runtime]` attribute on a `static` +variable of a type that implements the `core::detect::Runtime` `trait` (see +[definition below][runtime-trait]). + +This is analogous to how the `#[global_allocator]` is currently defined in Rust +programs. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +This RFC introduces: + +* a new attribute: `#[target_feature_detection_runtime]`, +* a new trait: `core::detect::Runtime`, and +* a new function: `core::detect::is_target_feature_detected`. + +## The `#[target_feature_detection_runtime]` attribute + +The `#[target_feature_detection_runtime]` attribute can be used to _define_ a target-feature +detection run-time by applying it to a `static` variable as follows: + +```rust +#[target_feature_detection_runtime] +static TargetFeatureRT: impl core::detect::Runtime; +``` + +Only one such definition is allowed per binary artifact (binary, cdylib, etc.), +similarly to how only one `#[global_allocator]` or `#[panic_handler]` is +allowed in the dependency graph. + +The `static` variable must implement the `core::detect::Runtime` `trait`. + +If no `#[target_feature_detection_runtime]` is provided anywhere in the +dependency graph, Rust provides a default definition. For `#![no_std]` binaries +and dynamic libraries, that is, for binaries and libraries that do not link +against `libstd`, this definition always returns `false` (it does nothing).
+ +## The `core::detect::Runtime` trait +[runtime-trait]: #runtime-trait + +The runtime must be a `static` variable of a type that implements the +`core::detect::Runtime` trait: + +```rust +unsafe trait core::detect::Runtime { + /// Returns `true` if the `feature` is known to be supported by the + /// current thread of execution and `false` otherwise. + #[rustc_const_function_arg(0)] + fn is_target_feature_detected(feature: &'static str) -> bool; +} +``` + +This `trait`, which is part of `libcore`, is `unsafe` to implement. A correct +implementation, satisfying the specified semantics of its methods is required +for soundness of safe Rust code. That is, an incorrect implementation can cause +safe Rust code to have undefined behavior. + +Forcing the `&'static str` to be a constant expression allows the +feature-detection macros to reliably produce compilation-errors on unknown +features, as well as on features that have not been stabilized yet. This type of +validation happens at compile-time, before the user-defined run-time is called. + +## The `core::detect::is_target_feature_detected` function + +Finally, the following function is added to `libcore`: + +```rust +/// Returns `true` if the `feature` is known to be supported by the +/// current thread of execution and `false` otherwise. +#[rustc_const_function_arg(0)] +fn is_target_feature_detected(feature: &'static str) -> bool; +``` + +This function calls the `Runtime::is_target_feature_detected` method. Its +argument must be a constant-expression. + +--- + +Finally, this RFC moves the feature-detection macros of `libstd` to `libcore`. +Right now, the only stable feature-detection macro is +`is_x86_feature_detected!("target_feature_name")`. + +The semantics of these macros are modified to: + +```rust +/// Returns `true` if `cfg!(target_feature = feature)` is `true`, and +/// returns the value of `core::detect::is_feature_detected(feature)` +/// otherwise. 
+/// +/// If `feature` is not known to be a valid feature for the current +/// `architecture`, the program is ill-formed, and a compile-time +/// diagnostic is emitted. +is_{architecture}_feature_detected!(feature: &'static str) -> bool; +``` + +# Drawbacks +[drawbacks]: #drawbacks + +This increases the complexity of the implementation, adding another singleton +run-time component. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +## Rationale + +This approach satisfies all self-imposed constraints: + +* **zero-cost abstraction**: the APIs provided just call the run-time. If the + user can do better than, e.g., the run-time provided by Rust, they can just + override it with their own. + +* **don't pay for what you don't use**: programs that never do run-time feature + detection, never call any of the APIs. LTO should be able to optimize the + run-time away. If it isn't, users can provide their own "empty" run-time. + +* **99.9%** This enables `libcore`, `liballoc`, and `#![no_std]` libraries in + general to do run-time feature detection. The majority of Rust users, benefits + from that silently even though they might never use this feature themselves. + +* **reliable**: the default `#![no_std]` run-time provided by Rust always + returns `false`, that is, that a feature is not enabled, such that the + run-time feature detection macros will return `true` only for the features + enabled at compile-time; this is always correct. The `Runtime` trait is also + `unsafe` to implement. + +* **portable libraries**: libraries that use run-time feature detection are not + restricted to `#![std]` binaries anymore - they can be used by `#![no_std]` + libraries and binaries as well. + +* **cross-domain**: the run-time provided by Rust by default requires operating + system support, that is, for custom targets, no run-time will be provided. + Users of these targets can use any run-time that satisfies their constraints. 
+ +* **cdylibs**: dynamic libraries get the same default run-time as Rust binaries, + i.e., the `libstd` one if `libstd` is linked, and one that returns `false` if + the `cdylib` is `#![no_std]`, in which case the `cdylib` can provide their + own. + +* **embedded systems**: binaries running on read-only memory (e.g. in a ROM) can + implement a run-time that, e.g., does not cache any results, which would + require read-write memory, and instead, recomputes results on all invocations, + always returns false, contains features for different CPUs pre-computed in + read-only memory, and only detects the CPU type, etc. Even when implementing a + feature cache, one often needs to choose between using atomics, thread-locals, + mutexes, or no synchronization if the application is single-threaded. Not all + embedded systems support all these features. + +## Alternatives + +We could not solve this problem. In which case, `libcore` can't use run-time +feature detection, e.g., to use advanced SIMD instructions. + +We also could do something different. For example, we could provide a "cache" in +libcore, and an API for users or only for the standard library, to initialize +this cache externally, e.g., during the standard library initialization routine. + +This runs into problems with `cdylibs`, where these routines might not be +called. It also runs into problems with often imposing a cost on users, e.g., +due to a cache in libcore, even though users might never use it. This would be +limiting, if e.g. having a cache in read-write memory prevents libcore from +being compiled to a read-only binary. We would need to feature gate this +functionality to avoid these issues. + +It isn't cross-domain either, e.g., an OS kernel would need to disable this +functionality, and wouldn't be able to provide their own. So while they could +use libraries that would do run-time feature detection, no meaningful detection +would be performed. 
+ +# Prior art +[prior-art]: #prior-art + +This feature is very similar to `#[global_allocator]` and `#[panic_handler]`. +Since a default implementation is provided if the user does not provide one, +this is a backward compatible change. + +This feature does not exist in any programming languages I know. Clang and GCC +do have a feature-detection run-time, which is not configurable, nor does it +work for all users. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +None. + +# Future possibilities +[future-possibilities]: #future-possibilities + +None. After this RFC, the run-time feature detection part of the Rust language +should be complete. From 59463e1c13aa890f48b42555061bbe27503064ac Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Sat, 13 Jul 2019 15:13:07 +0200 Subject: [PATCH 2/8] Add example, use non-exhaustive enum, constraints->use cases --- text/0000-target-feature-runtime.md | 262 ++++++++++++++++++---------- 1 file changed, 168 insertions(+), 94 deletions(-) diff --git a/text/0000-target-feature-runtime.md b/text/0000-target-feature-runtime.md index 56e7a934f7f..32f487d3961 100644 --- a/text/0000-target-feature-runtime.md +++ b/text/0000-target-feature-runtime.md @@ -40,29 +40,41 @@ macros, because they are not available. The goal of this RFC is to enable `#![no_std]` libraries and binaries to perform run-time feature detection. -# Constraints on the design - -It helps to first enumerate the "self-imposed" constraints on the design: - -* **zero-cost abstraction**: it shouldn't be possible to do runtime feature - detection better than via the APIs provided here. -* **don't pay for what you don't use**: programs that don't need to do any - runtime feature detection should not pay anything for it, in terms of costs in - binary size, memory usage, time spent on binary initialization, etc. -* **99.9%** The majority of Rust users should be able to benefit from this, - e.g., via `libcore` using it, without having to know that this exists. 
-* **reliable**: it should be possible to reliably do run-time feature detection, - since it is required to prove that some `unsafe` code is actually safe. -* **portable libraries**: libraries that use run-time feature detection should - be able to do so, without restricting which users can use the library. -* **cross-domain**: operating system kernels and user-space applications often - need to do run-time feature detection in very different ways - all use cases - should be supported. -* **cdylibs**: dynamic libraries should be able to do run-time feature detection. -* **embedded systems**: binaries running on read-only memory (e.g. in a ROM) - should be able to do runtime feature detection. - -These constraints motivate the design. +However, `#![no_std]` libraries and binaries are used in a wider range of +applications than `#![std]` ones, and they might often want to perform +run-time feature detection differently. Among others: + +* **user-space applications**: performing run-time feature detection often + requires executing privileged CPU instructions that are illegal to execute + from user-space code. User-space applications query the available + target-feature set from the operating system. Often, they might also want to + cache the result to avoid repeating system calls. + +* **privileged applications**: operating-system kernels, embedded applications, + etc. are often able to execute privileged CPU instructions, and they have no + "OS" they can query available features from. They are also often subject to + additional constraints. For example, they might not want to use certain + features, like floating point or SIMD registers, to avoid saving them on + context switches, or a feature cache that's modified at run-time, to allow + them to run on read-only memory, e.g., on ROM.
They are also limited in how they + implement a feature cache, depending on the availability of atomic + instructions, mutexes, and thread-locals; many of these applications are + actually single-threaded, so they should be able to implement a cache without + any synchronization at all. + +* **cdylibs**: dynamically-linked Rust libraries with a C ABI often cannot + perform any sort of initialization at link-time. That is, they should be able + to initialize their target-feature cache, if they have one, on first use. + +Libraries use run-time feature detection to prove that some `unsafe` code is +safe. So it is crucial that users can easily implement feature-detection +run-times that are correct. + +On top of these constraints, we impose the classical constraints on new Rust +features. This must be a zero-cost abstraction that all Rust code can just use, +without any "but"s. Also, applications that do not perform any run-time feature +detection should not pay any price for it. This includes no run-time or +initialization overhead, no extra memory usage, and no increase in code or binary size. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation @@ -76,23 +88,53 @@ Users can now provide their own target-feature detection run-time: ```rust #[target_feature_detection_runtime] -static TargetFeatureRT: impl core::detect::Runtime; +static TargetFeatureRT: impl core::detect::TargetFeatureRuntime; ``` by using the `#[target_feature_detection_runtime]` attribute on a `static` -variable of a type that implements the `core::detect::Runtime` `trait` (see +variable of a type that implements the `core::detect::TargetFeatureRuntime` `trait` (see [definition below][runtime-trait]). This is analogous to how the `#[global_allocator]` is currently defined in Rust programs.
+For example, an embedded application running on `aarch64` can implement a +run-time as follows to detect some target-features without caching them: + +```rust +struct Runtime; +unsafe impl core::detect::TargetFeatureRuntime for Runtime { + fn is_target_feature_detected(feature: core::detect::TargetFeature) -> bool { + // note: `TargetFeature` is a `#[non_exhaustive]` enum. + use core::detect::TargetFeature::*; + + // note: `mrs` is a privileged instruction: + match feature { + Aes => { + let aa64isar0: u64; // Instruction Set Attribute Register 0 + unsafe { asm!("mrs $0, ID_AA64ISAR0_EL1" : "=r"(aa64isar0)); } + bits_shift(aa64isar0, 7, 4) >= 1 + }, + Asimd => { + let aa64pfr0: u64; // Processor Feature Register 0 + unsafe { asm!("mrs $0, ID_AA64PFR0_EL1" : "=r"(aa64pfr0)); } + bits_shift(aa64pfr0, 23, 20) < 0xF + }, + // features that we don't detect are reported as "disabled": + _ => false, + } + } +} +``` + # Reference-level explanation [reference-level-explanation]: #reference-level-explanation This RFC introduces: * a new attribute: `#[target_feature_detection_runtime]`, -* a new trait: `core::detect::Runtime`, and +* a new trait: `core::detect::TargetFeatureRuntime`, +* a new enum: `core::detect::TargetFeature`, and * a new function: `core::detect::is_target_feature_detected`. ## The `#[target_feature_detection_runtime]` attribute @@ -102,32 +144,33 @@ detection run-time by applying it to a `static` variable as follows: ```rust #[target_feature_detection_runtime] -static TargetFeatureRT: impl core::detect::Runtime; +static TargetFeatureRT: impl core::detect::TargetFeatureRuntime; ``` Only one such definition is allowed per binary artifact (binary, cdylib, etc.), similarly to how only one `#[global_allocator]` or `#[panic_handler]` is allowed in the dependency graph. -The `static` variable must implement the `core::detect::Runtime` `trait`. +The `static` variable must implement the `core::detect::TargetFeatureRuntime` +`trait`.
If no `#[target_feature_detection_runtime]` is provided anywhere in the dependency graph, Rust provides a default definition. For `#![no_std]` binaries and dynamic libraries, that is, for binaries and libraries that do not link against `libstd`, this definition always returns `false` (it does nothing). -## The `core::detect::Runtime` trait +## The `core::detect::TargetFeatureRuntime` trait [runtime-trait]: #runtime-trait -The runtime must be a `static` variable of a type that implements the -`core::detect::Runtime` trait: +The run-time must be a `static` variable of a type that implements the +`core::detect::TargetFeatureRuntime` trait: ```rust -unsafe trait core::detect::Runtime { +unsafe trait core::detect::TargetFeatureRuntime { /// Returns `true` if the `feature` is known to be supported by the /// current thread of execution and `false` otherwise. - #[rustc_const_function_arg(0)] - fn is_target_feature_detected(feature: &'static str) -> bool; + fn is_target_feature_detected(feature: core::detect::TargetFeature) + -> bool; } ``` @@ -136,10 +179,27 @@ implementation, satisfying the specified semantics of its methods is required for soundness of safe Rust code. That is, an incorrect implementation can cause safe Rust code to have undefined behavior. -Forcing the `&'static str` to be a constant expression allows the -feature-detection macros to reliably produce compilation-errors on unknown -features, as well as on features that have not been stabilized yet. This type of -validation happens at compile-time, before the user-defined run-time is called. +## The `core::detect::TargetFeature` enum + +A `#[non_exhaustive]` `enum` is added to the `core::detect` module: + +```rust +#[non_exhaustive] enum TargetFeature { ... } +``` + +> Unresolved question: should this `enum` be in `core::arch::{arch}` ? + +The variants of this `enum` are architecture-specific, and adding new variants +to the `enum` is a forward-compatible change. 
+ +Each enum variant is named after a target-feature of the target, where the +target-feature strings accepted by the run-time feature detection macros are +mapped to variants by capitalizing their first letter. + +For example, `is_x86_feature_detected!("avx")` corresponds to +`TargetFeature::Avx`. Variants corresponding to unstable target-features are +gated behind their feature flag. For example, using `TargetFeature::Avx512f` +requires enabling `feature(avx512_target_feature)`. ## The `core::detect::is_target_feature_detected` function @@ -148,12 +208,11 @@ Finally, the following function is added to `libcore`: ```rust /// Returns `true` if the `feature` is known to be supported by the /// current thread of execution and `false` otherwise. -#[rustc_const_function_arg(0)] -fn is_target_feature_detected(feature: &'static str) -> bool; +fn is_target_feature_detected(feature: core::detect::TargetFeature) -> bool; ``` -This function calls the `Runtime::is_target_feature_detected` method. Its -argument must be a constant-expression. +This function calls the `TargetFeatureRuntime::is_target_feature_detected` +method. --- @@ -164,16 +223,22 @@ Right now, the only stable feature-detection macro is The semantics of these macros are modified to: ```rust -/// Returns `true` if `cfg!(target_feature = feature)` is `true`, and -/// returns the value of `core::detect::is_feature_detected(feature)` +/// Returns `true` if `cfg!(target_feature = string-literal)` is `true`, and +/// returns the value of `core::detect::is_target_feature_detected` for the feature /// otherwise. /// /// If `feature` is not known to be a valid feature for the current -/// `architecture`, the program is ill-formed, and a compile-time -/// diagnostic is emitted. -is_{architecture}_feature_detected!(feature: &'static str) -> bool; +/// `architecture`, or the required `feature()` gates to use the feature are +/// not enabled, the program is ill-formed, and a compile-time diagnostic is +/// emitted.
+is_{architecture}_feature_detected!(string-literal) -> bool; ``` +> Implementation note: currently, the compilation-errors are emitted by the +> macro by pattern-matching on the literals. The mapping from the literals to +> the variants of the `TargetFeature` enum happens also at compile-time by +> pattern matching the literals. + # Drawbacks [drawbacks]: #drawbacks @@ -185,63 +250,68 @@ run-time component. ## Rationale -This approach satisfies all self-imposed constraints: - -* **zero-cost abstraction**: the APIs provided just call the run-time. If the - user can do better than, e.g., the run-time provided by Rust, they can just - override it with their own. +This approach satisfies all considered use-cases: + +* `libcore`, `liballoc` and other `#![no_std]` libraries and applications can + just use the run-time feature detection macros to use extra CPU features, when + available. This will happen automatically if a meaningful run-time is linked, + but will not introduce unsoundness if no run-time is available, since all + features are then reported as disabled. + +* **user-space applications**: can implement run-times that query the operating + system for features, or use CPU instructions for those architectures in which + they are not privileged. They can cache the results in various ways, or + disable target-feature detection completely if they so desired, e.g., by + providing a run-time that always returns false. By default, `libstd` will + provide a run-time that's meaningful for user-space, such that these + applications don't have to do anything, and such that their `#![no_std]` + dependencies like `libcore` can perform run-time feature detection. -* **don't pay for what you don't use**: programs that never do run-time feature - detection, never call any of the APIs. LTO should be able to optimize the - run-time away. If it isn't, users can provide their own "empty" run-time. 
- -* **99.9%** This enables `libcore`, `liballoc`, and `#![no_std]` libraries in - general to do run-time feature detection. The majority of Rust users, benefits - from that silently even though they might never use this feature themselves. - -* **reliable**: the default `#![no_std]` run-time provided by Rust always - returns `false`, that is, that a feature is not enabled, such that the - run-time feature detection macros will return `true` only for the features - enabled at compile-time; this is always correct. The `Runtime` trait is also - `unsafe` to implement. - -* **portable libraries**: libraries that use run-time feature detection are not - restricted to `#![std]` binaries anymore - they can be used by `#![no_std]` - libraries and binaries as well. +* **privileged applications**: OS kernels and embedded applications can provide + a run-time that satisfies their use case and constraints. + +* **cdylibs**: dynamic libraries linked against the standard library get by + default the `libstd` run-time. If these are `#![no_std]`, but have access to + system APIs, e.g., via `libc`, they might be able to just include the `libstd` + run-time from crates.io, without having to depend on `libstd` itself. + Otherwise, they can use their knowledge of the target they are running on to + implement their own run-time. -* **cross-domain**: the run-time provided by Rust by default requires operating - system support, that is, for custom targets, no run-time will be provided. - Users of these targets can use any run-time that satisfies their constraints. +Implementing a run-time requires an `unsafe` trait impl, making it clear that +care must be taken. The API requires run-times to just return `false` on +unknown features, making them conservative in such a way that prevents +unsoundness in safe Rust code. If a run-time doesn't support a feature, safe +Rust might panic, or run slower, but it will not try to run code that requires +an unsupported feature. 
+ -* **cdylibs**: dynamic libraries get the same default run-time as Rust binaries, - i.e., the `libstd` one if `libstd` is linked, and one that returns `false` if - the `cdylib` is `#![no_std]`, in which case the `cdylib` can provide their - own. - -* **embedded systems**: binaries running on read-only memory (e.g. in a ROM) can - implement a run-time that, e.g., does not cache any results, which would - require read-write memory, and instead, recomputes results on all invocations, - always returns false, contains features for different CPUs pre-computed in - read-only memory, and only detects the CPU type, etc. Even when implementing a - feature cache, one often needs to choose between using atomics, thread-locals, - mutexes, or no synchronization if the application is single-threaded. Not all - embedded systems support all these features. +If a program never performs any run-time feature detection, all +detection-related code is dead. LTO should be able to remove this code, but if +this were to fail, users can always define a dummy run-time that always returns +false, and has no caches, etc. + +The run-time feature-detection API dispatches calls to the run-time only when +necessary. If the default run-time isn't "the best" along some axis for some +application, this RFC allows the application to replace it with a better one. +With this RFC, there is no reason not to use the run-time feature detection +macros. ## Alternatives -We could not solve this problem. In which case, `libcore` can't use run-time -feature detection, e.g., to use advanced SIMD instructions. +We don't have to solve this problem. This means that `libcore` and other +`#![no_std]` libraries can't use run-time feature detection, and can't benefit, +e.g., from advanced SIMD instructions. We also could do something different.
For example, we could provide a "cache" in libcore, and an API for users or only for the standard library, to initialize this cache externally, e.g., during the standard library initialization routine. -This runs into problems with `cdylibs`, where these routines might not be -called. It also runs into problems with often imposing a cost on users, e.g., -due to a cache in libcore, even though users might never use it. This would be -limiting, if e.g. having a cache in read-write memory prevents libcore from -being compiled to a read-only binary. We would need to feature gate this -functionality to avoid these issues. +This runs into problems with `cdylib`s, where these routines might not be called +automatically, potentially requiring C code to have to manually call into +`libstd` initialization routines. It also runs into problems with often imposing +a cost on users, e.g., due to a cache in libcore, even though users might never +use it. This would be limiting, if e.g. having a cache in read-write memory +prevents libcore from being compiled to a read-only binary. We would need to +feature gate this functionality to avoid these issues. It isn't cross-domain either, e.g., an OS kernel would need to disable this functionality, and wouldn't be able to provide their own. So while they could @@ -262,7 +332,11 @@ work for all users. # Unresolved questions [unresolved-questions]: #unresolved-questions -None. +* Should the API use a `TargetFeature` `enum` or be stringly-typed like the + macros and use string literals? + +* Since the `TargetFeature` `enum` is architecture-specific, should it live in + `core::arch::{target_arch}::TargetFeature` ? 
# Future possibilities [future-possibilities]: #future-possibilities From a354f98b9ba7cbd17e527804a9148d6cdb210d13 Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Sun, 14 Jul 2019 11:03:21 +0200 Subject: [PATCH 3/8] Improve summary --- text/0000-target-feature-runtime.md | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/text/0000-target-feature-runtime.md b/text/0000-target-feature-runtime.md index 32f487d3961..43c864cc8e4 100644 --- a/text/0000-target-feature-runtime.md +++ b/text/0000-target-feature-runtime.md @@ -6,8 +6,23 @@ # Summary [summary]: #summary -This RFC allows `#![no_std]` binaries and libraries (e.g. like `libcore` and -`liballoc`) to perform run-time feature detection. +Right now, only `#![std]` Rust libraries and binaries can perform target-feature +detection at run-time via the stable APIs provided by the +`is_..._feature_detected!("target-feature")` macros in `libstd`. + +This RFC extends that support to allow `#![no_std]` binaries and libraries (e.g. +like `libcore` and `liballoc`) to perform target-feature detection at run-time +as well. + +This proposal achieves that by exposing the API from `libcore` and by allowing +users to provide their own run-time for performing target-feature detection. If +no user-defined run-time is provided, a fallback is provided. If `libstd` is +linked, this fallback is the current runtime, preserving the current stable Rust +behavior. + +This enables all Rust code to use the stable target-feature detection APIs, +while allowing final binary artifacts to customize its behavior to satisfy their +use-cases. 
# Motivation [motivation]: #motivation From ed24e71c557254ed0f39e127f6c7fe523556657f Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Sun, 14 Jul 2019 15:17:42 +0200 Subject: [PATCH 4/8] Add introduction; do not allow overriding libstd runtime --- text/0000-target-feature-runtime.md | 102 ++++++++++++++++++++-------- 1 file changed, 75 insertions(+), 27 deletions(-) diff --git a/text/0000-target-feature-runtime.md b/text/0000-target-feature-runtime.md index 43c864cc8e4..2f9034e38f6 100644 --- a/text/0000-target-feature-runtime.md +++ b/text/0000-target-feature-runtime.md @@ -16,8 +16,8 @@ as well. This proposal achieves that by exposing the API from `libcore` and by allowing users to provide their own run-time for performing target-feature detection. If -no user-defined run-time is provided, a fallback is provided. If `libstd` is -linked, this fallback is the current runtime, preserving the current stable Rust +no user-defined run-time is provided, a fallback is provided. `libstd` provides +a target-feature detection run-time, preserving the current stable Rust behavior. This enables all Rust code to use the stable target-feature detection APIs, @@ -27,35 +27,76 @@ use-cases. # Motivation [motivation]: #motivation -Binaries and libraries using the `std` library can perform run-time feature -detection via the `is_x86_feature_detected!("avx")` architecture-specific -macros. +## Background on target features -This operation requires, in general, operating system support, and is therefore -not available in `libcore`, which is operating system agnostic. +> **Note**: if you know what target features are and how to write code that +> conditionally uses them from Rust you can safely skip this sub-section. -That is, `#![no_std]` libraries, like `liballoc` and `libcore`, cannot perform -run-time feature detection, even though `libstd` often ends up being linked into -the final binary. 
+A Rust target triple, like `x86_64-apple-darwin`, produce binaries that can run +on all CPUs of the `x86_64` family that support certain architecture +"extensions"; for this particular case, all CPUs the binary runs on must support +the SSE3 vector extensions. That is, all Rust programs compiled for this target +can safely make use of SSE3 instructions, since all CPUs where those binaries +are allowed to run support them. On the other hand, the +`x86_64-unknown-linux-gnu` target only requires SSE2 vector extensions. For a +binary to use SSE3 instructions, it would first need to check whether the CPU +supports them, since this is not necessarily the case. In Rust, the behavior of +attempting to execute an unsupported instruction is undefined, and the compiler +optimizes under the assumption that this does not happen. -This results in some crates in crates.io having much better performance than the -methods of the types provided by `libcore`, like `&str`, `[T]`, `Iterator`, etc. +In Rust, we call `x86_64` the target architecture "family", and extensions like +SSE2 or SSE3 "target-features". -One example is the `is_sorted` crate, which provides an implementation of -`Iterator::is_sorted`, which performs 16x better than the `libcore` -implementation by using AVX. Another example include the `memchr` crate, as well -as crates implementing algorithms to compute whether a `[u8]` is an ASCII -string, or an UTF-8 string. These perform on the ballpark of about 1.6x better -than the `libcore` implementations, by using AVX on x86. +Currently, target-features can be detected: + +* at compile-time: using `#[cfg(target_feature = literal)]` to conditionally + compile code. +* at run-time: using the `is_{target_arch}_feature_detected!(literal)` macros + from the standard library to query whether the system the binary runs on + actually supports a feature or not. 
+ +## Problem statement + +The `cfg(target_feature)` macro can be used by all Rust code, but is limited to +the set of features that are unconditionally enabled for the target. + +The architecture-specific `is_{target_arch}_feature_detected!(literal)` macros +require operating-system support and are therefore only exposed by the standard +library; `#![no_std]` libraries, like `liballoc` and `libcore` are platform +agnostic and cannot currently perform run-time feature detection. + +That is, currently, libraries have to choose between being `#![no_std]`-compatible, +or performing target-feature detection at run-time. -For `#![no_std]` binaries, the standard library is not linked into the final -binary, and they cannot use any library that uses the runtime feature detection -macros, because they are not available. +As a consequence, there are crates in `crates.io` re-implementing methods of +`libcore` types like `&str`, `[T]`, `Iterator`, etc. with much better +performance. + +One example is the `is_sorted` crate, which provides an implementation of +`Iterator::is_sorted`, which performs 16x better for some inputs than the +`libcore` implementation by using AVX. Another example include the `memchr` +crate, as well as crates implementing algorithms to compute whether a `[u8]` is +an ASCII string or an UTF-8 string, which end up being used every time a program +calls `String::from_utf8`. By using AVX on x86, these perform on the ballpark of +about 1.6x better than the `libcore` implementations, and could probably do +better using AVX-512. Most Rust code cannot, however, benefit from them, +because they will be using `String::from_utf8` via the standard library. + +This is a shame. Whether a library is `#![no_std]` or not is orthogonal to +whether the final binary is able to perform run-time feature detection and most +binaries using `#![no_std]` crates do end up linking `libstd` into the final +binary. 
+ +On the other hand, `#![no_std]` binaries cannot use any library that uses the +runtime feature detection macros, even though for these it would be better to +just report that all features are disabled instead of splitting the ecosystem. The goal of this RFC is to enable `#![no_std]` libraries and binaries to perform run-time feature detection. -However, `#![no_std]` libraries and binaries are used in a wider-range of +## Use cases + +`#![no_std]` libraries and binaries are used in a wider-range of applications than `#![std]` libraries ones, and they might often want to perform run-time feature detection differently. Among others: @@ -170,9 +211,9 @@ The `static` variable must implement the `core::detect::TargetFeatureRuntime` `trait`. If no `#[target_feature_detection_runtime]` is provided anywhere in the -dependency graph, Rust provides a default definition. For `#![no_std]` binaries -and dynamic libraries, that is, for binaries and libraries that do not link -against `libstd`, this definition always returns `false` (it does nothing). +dependency graph, Rust provides a default definition that always returns `false` +(no feature is detected). When `libstd` is linked, it provides a target-feature +detection run-time. ## The `core::detect::TargetFeatureRuntime` trait [runtime-trait]: #runtime-trait @@ -257,8 +298,8 @@ is_{architecture}_feature_detected!(string-literal) -> bool; # Drawbacks [drawbacks]: #drawbacks -This increases the complexity of the implementation, adding another singleton -run-time component. +This increases the complexity of the implementation, adding another "singleton" +run-time component. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -353,6 +394,13 @@ work for all users. * Since the `TargetFeature` `enum` is architecture-specific, should it live in `core::arch::{target_arch}::TargetFeature` ? +* How does it fit with the Roadmap? Does it fit with the Roadmap at all? 
Would + it fit with any future Roadmap? + +* Should the `libstd` run-time be overridable? For example, by only providing it + if no other crate in the dependency graph provides a runtime ? This would be a + forward-compatible extension, but no use case considered requires it. + # Future possibilities [future-possibilities]: #future-possibilities From 7cdaff85ffcabff0caceaa75ced6002cea71289a Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Sun, 14 Jul 2019 15:39:09 +0200 Subject: [PATCH 5/8] Reword intro --- text/0000-target-feature-runtime.md | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/text/0000-target-feature-runtime.md b/text/0000-target-feature-runtime.md index 2f9034e38f6..2173d235c36 100644 --- a/text/0000-target-feature-runtime.md +++ b/text/0000-target-feature-runtime.md @@ -34,18 +34,18 @@ use-cases. A Rust target triple, like `x86_64-apple-darwin`, produce binaries that can run on all CPUs of the `x86_64` family that support certain architecture -"extensions"; for this particular case, all CPUs the binary runs on must support -the SSE3 vector extensions. That is, all Rust programs compiled for this target -can safely make use of SSE3 instructions, since all CPUs where those binaries -are allowed to run support them. On the other hand, the -`x86_64-unknown-linux-gnu` target only requires SSE2 vector extensions. For a -binary to use SSE3 instructions, it would first need to check whether the CPU -supports them, since this is not necessarily the case. In Rust, the behavior of -attempting to execute an unsupported instruction is undefined, and the compiler -optimizes under the assumption that this does not happen. +"extensions". This particular target requires SSE3 vector extensions, that is, +binaries compiled for this target are only able to run on CPUs that support this +particular extension. Other targets require different sets of extensions. 
For +example, `x86_64-unknown-linux-gnu` only requires SSE2 support, allowing +binaries to run on CPUs that do not support SSE3. In Rust, we call `x86_64` the target architecture "family", and extensions like -SSE2 or SSE3 "target-features". +SSE2 or SSE3 "target-features". The behavior of attempting to execute an +unsupported instruction is undefined, and the compiler optimizes under the +assumption that this does not happen. It is therefore crucial for Rust code to +be able to make sure that these extensions are only used when they are +available. Currently, target-features can be detected: @@ -70,7 +70,7 @@ or performing target-feature detection at run-time. As a consequence, there are crates in `crates.io` re-implementing methods of `libcore` types like `&str`, `[T]`, `Iterator`, etc. with much better -performance. +performance by using target-feature detection at run-time. One example is the `is_sorted` crate, which provides an implementation of `Iterator::is_sorted`, which performs 16x better for some inputs than the From 4aced3b862adffa58d2f9ca6da9add1259ef59c2 Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Sun, 14 Jul 2019 15:53:44 +0200 Subject: [PATCH 6/8] Note that not everything must be stabilized simultaneously --- text/0000-target-feature-runtime.md | 34 ++++++++++++++++++++++++----- 1 file changed, 28 insertions(+), 6 deletions(-) diff --git a/text/0000-target-feature-runtime.md b/text/0000-target-feature-runtime.md index 2173d235c36..9d8fb9e4067 100644 --- a/text/0000-target-feature-runtime.md +++ b/text/0000-target-feature-runtime.md @@ -193,6 +193,16 @@ This RFC introduces: * a new enum: `core::detect::TargetFeature`, and * a new function: `core::detect::is_target_feature_detected`. +and moves the target-feature detection macros to `libcore`. + +We could stabilize all of this in stages. 
First, we could just stabilize using +the target-feature detection macros from `libcore`, which would unlock doing +target-feature detection in `#![no_std]` libraries. Unlocking some of the main +use-cases like being able to use them in `libcore` itself, and have them do +something meaningful when `libstd` is linked. Then, we could extend that with +the rest of the API, which allows `#![no_std]` binaries to provide their own +target-feature detection runtime. + ## The `#[target_feature_detection_runtime]` attribute The `#[target_feature_runtime]` can be used to _define_ a target-feature @@ -353,13 +363,18 @@ macros. ## Alternatives -We don't have to solve this problem. This means that `libcore` and other -`#![no_std]` libraries can't use run-time feature detection, and can't benefit, -e.g., of advanced SIMD instructions. +The main alternative is that we don't have to stabilize everything at the same +time. We could implement this as proposed, but only stabilize using the +target-feature detection macros via `libcore`. This would mean that initially, +`#![no_std]` binaries won't be able to implement their own run-times, but that +would unlock using the macros on all `#![no_std]` libraries, and these macros +would do something meaningful if `libstd` is linked into the final binary. -We also could do something different. For example, we could provide a "cache" in -libcore, and an API for users or only for the standard library, to initialize -this cache externally, e.g., during the standard library initialization routine. +## libcore pulls target-features approach + +We could provide a "cache" in libcore, and an API for users or only for the +standard library, to initialize this cache externally, e.g., during the standard +library initialization routine. This runs into problems with `cdylib`s, where these routines might not be called automatically, potentially requiring C code to have to manually call into @@ -388,6 +403,13 @@ work for all users. 
# Unresolved questions [unresolved-questions]: #unresolved-questions +* We could implement this RFC, without making any APIs public, by just moving + the feature-detection macros to `libcore`. That would allow `#![no_std]` + libraries to use them, and they will do something meaningful if `libstd` is + linked. `#![no_std]` binaries won't be able to provide their own run-time, but + the APIs for this (the trait, enum, and `#[target_feature_detection_runtime]` + attribute) could be stabilized at a later time. + * Should the API use a `TargetFeature` `enum` or be stringly-typed like the macros and use string literals? From 8396166c67cc6a6f9a4cf7dea6a675ba50279207 Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Sun, 14 Jul 2019 16:11:07 +0200 Subject: [PATCH 7/8] Reword summary --- text/0000-target-feature-runtime.md | 29 ++++++++++++----------------- 1 file changed, 12 insertions(+), 17 deletions(-) diff --git a/text/0000-target-feature-runtime.md b/text/0000-target-feature-runtime.md index 9d8fb9e4067..5a807698c32 100644 --- a/text/0000-target-feature-runtime.md +++ b/text/0000-target-feature-runtime.md @@ -6,23 +6,18 @@ # Summary [summary]: #summary -Right now, only `#![std]` Rust libraries and binaries can perform target-feature -detection at run-time via the stable APIs provided by the -`is_..._feature_detected!("target-feature")` macros in `libstd`. - -This RFC extends that support to allow `#![no_std]` binaries and libraries (e.g. -like `libcore` and `liballoc`) to perform target-feature detection at run-time -as well. - -This proposal achieves that by exposing the API from `libcore` and by allowing -users to provide their own run-time for performing target-feature detection. If -no user-defined run-time is provided, a fallback is provided. `libstd` provides -a target-feature detection run-time, preserving the current stable Rust -behavior. 
- -This enables all Rust code to use the stable target-feature detection APIs, -while allowing final binary artifacts to customize its behavior to satisfy their -use-cases. +Right now, the `is_..._feature_detected!("target-feature")` macros exported by +`libstd` are the only proper way in which Rust libraries and binaries can +perform runtime target-feature detection. + +This RFC extends that support to `#![no_std]` libraries by moving the +target-feature detection macros to `libcore`. This enables all Rust libraries, +including `libcore`, to perform target-feature detection at run-time. + +The implementation proposed can be, as an extension, stabilized. This would +allow `#![no_std]` binaries to provide their own target-feature-detection +run-time. This would allow `#![no_std]` binaries to benefit from +target-feature-detection as well. # Motivation [motivation]: #motivation From 7105ae603fe1a2a2a85d7c5622312274ae152e8b Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Sun, 14 Jul 2019 16:59:02 +0200 Subject: [PATCH 8/8] Reword text --- text/0000-target-feature-runtime.md | 208 ++++++++++++++++------------ 1 file changed, 120 insertions(+), 88 deletions(-) diff --git a/text/0000-target-feature-runtime.md b/text/0000-target-feature-runtime.md index 5a807698c32..e04552484bd 100644 --- a/text/0000-target-feature-runtime.md +++ b/text/0000-target-feature-runtime.md @@ -8,39 +8,43 @@ Right now, the `is_..._feature_detected!("target-feature")` macros exported by `libstd` are the only proper way in which Rust libraries and binaries can -perform runtime target-feature detection. +perform target-feature detection at run-time. -This RFC extends that support to `#![no_std]` libraries by moving the +This RFC extends that support to `#![no_std]` libraries, by moving the target-feature detection macros to `libcore`. This enables all Rust libraries, including `libcore`, to perform target-feature detection at run-time. The implementation proposed can be, as an extension, stabilized. 
This would allow `#![no_std]` binaries to provide their own target-feature-detection -run-time. This would allow `#![no_std]` binaries to benefit from -target-feature-detection as well. +run-time and benefit from it as well. # Motivation [motivation]: #motivation -## Background on target features +## Refresher on target features -> **Note**: if you know what target features are and how to write code that -> conditionally uses them from Rust you can safely skip this sub-section. +> You can safely skip this sub-section if you are familiar with compile-time and +> run-time target-feature detection in Rust. A Rust target triple, like `x86_64-apple-darwin`, produce binaries that can run on all CPUs of the `x86_64` family that support certain architecture -"extensions". This particular target requires SSE3 vector extensions, that is, -binaries compiled for this target are only able to run on CPUs that support this -particular extension. Other targets require different sets of extensions. For -example, `x86_64-unknown-linux-gnu` only requires SSE2 support, allowing -binaries to run on CPUs that do not support SSE3. - -In Rust, we call `x86_64` the target architecture "family", and extensions like -SSE2 or SSE3 "target-features". The behavior of attempting to execute an -unsupported instruction is undefined, and the compiler optimizes under the -assumption that this does not happen. It is therefore crucial for Rust code to -be able to make sure that these extensions are only used when they are -available. +"extensions". This particular target requires SSE3 vector extensions, and Rust +will emit them whenever it deems fit. As a consequence, binaries compiled for +this target can only run on CPUs that support the SSE3 extension. Other targets require +different sets of extensions. For example, `x86_64-unknown-linux-gnu` only +requires SSE2 support, allowing binaries to run on CPUs that do not support +SSE3. 
In Rust, we call `x86_64` the target architecture "family", and extensions +like SSE2 or SSE3 "target-features". + +Many Rust applications compiled for `x86_64-unknown-linux-gnu` do want to use +SSE3 extensions when the CPU the binary runs on supports them, and Rust allows enabling these +extensions via the `#[target_feature]` function attribute. The behavior of a +program that attempts to execute code that uses an extension that is not +supported by the CPU on which the binary runs is undefined, and the compiler +generates machine code under the assumption that this does not happen. For such +programs to be safe, they need to detect whether the CPU on which the binary +runs supports the particular features that they want to use, and only use +them when the CPU actually supports them. Currently, target-features can be detected: @@ -52,39 +56,40 @@ Currently, target-features can be detected: ## Problem statement -The `cfg(target_feature)` macro can be used by all Rust code, but is limited to -the set of features that are unconditionally enabled for the target. +The `cfg(target_feature = "target_feature_literal")` macro can be used by all +Rust code, but is limited to the set of features that are unconditionally +enabled for the target. -The architecture-specific `is_{target_arch}_feature_detected!(literal)` macros -require operating-system support and are therefore only exposed by the standard -library; `#![no_std]` libraries, like `liballoc` and `libcore` are platform -agnostic and cannot currently perform run-time feature detection. +The architecture-specific +`is_{target_arch}_feature_detected!(target_feature_literal)` macros require +operating-system support and are therefore only exposed by the standard library; +`#![no_std]` libraries, like `liballoc` and `libcore`, are platform agnostic and +cannot currently perform run-time feature detection. 
That is, currently, libraries have to choose between being `#![no_std]`-compatible, or performing target-feature detection at run-time. As a consequence, there are crates in `crates.io` re-implementing methods of -`libcore` types like `&str`, `[T]`, `Iterator`, etc. with much better -performance by using target-feature detection at run-time. +`libcore` types like `&str`, `[T]`, `Iterator`, etc. but with much better +performance, by using target-feature detection at run-time. One example is the `is_sorted` crate, which provides an implementation of `Iterator::is_sorted`, which performs 16x better for some inputs than the -`libcore` implementation by using AVX. Another example include the `memchr` -crate, as well as crates implementing algorithms to compute whether a `[u8]` is -an ASCII string or an UTF-8 string, which end up being used every time a program -calls `String::from_utf8`. By using AVX on x86, these perform on the ballpark of -about 1.6x better than the `libcore` implementations, and could probably do -better using AVX-512. Most Rust code cannot, however, benefit from them, -because they will be using `String::from_utf8` via the standard library. +`libcore` implementation by using AVX when available. Other examples include +the `memchr` crate, as well as crates implementing algorithms to compute whether +a `[u8]` is an ASCII string or a UTF-8 string, which end up being used every +time a program calls `String::from_utf8`. By using AVX on x86, these perform in +the ballpark of 1.6x better than the `libcore` implementations, and could +probably do better using AVX-512. Most Rust code does not, however, benefit from +these, because this code calls `str::from_utf8`, which is part of `libcore`, which +cannot use run-time target-feature detection. This is a shame. 
Whether a library is `#![no_std]` or not is orthogonal to whether the final binary is able to perform run-time feature detection and most binaries using `#![no_std]` crates do end up linking `libstd` into the final -binary. - -On the other hand, `#![no_std]` binaries cannot use any library that uses the -runtime feature detection macros, even though for these it would be better to -just report that all features are disabled instead of splitting the ecosystem. +binary. Simultaneously, `#![no_std]` binaries cannot use any library that +performs run-time target-feature detection, even though it would be perfectly +safe for the API to just return that no features are detected at run-time. The goal of this RFC is to enable `#![no_std]` libraries and binaries to perform run-time feature detection. @@ -130,12 +135,13 @@ initialization overhead, no extra memory usage, and no code-size or binary size. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -Users can continue to perform run-time feature detection by using the -`is_{architecture}_feature_detected!` macros. These macros were previously only -available from libstd, and are now available in `libcore`. That is, `#![no_std]` -libraries and binaries can use them. +Users can continue to perform run-time feature detection by using the stable +`is_{architecture}_feature_detected!` macros. This RFC makes these macros +available in `libcore`, such that `#![no_std]` libraries and binaries can use +them. -Users can now provide their own target-feature detection run-time: +As an extension, this RFC also allows users to provide their own target-feature +detection run-time: ```rust #[target_feature_detection_runtime] @@ -143,8 +149,8 @@ static TargetFeatureRT: impl core::detect::TargetFeatureRuntime; ``` by using the `#[target_feature_detection_runtime]` attribute on a `static` -variable of a type that implements the `core::detect::TargetFeatureRuntime` `trait` (see -[definition below][runtime-trait]). 
+variable of a type that implements the `core::detect::TargetFeatureRuntime` +`trait` (see [definition below][runtime-trait]). This is analogous to how the `#[global_allocator]` is currently defined in Rust programs. @@ -178,25 +184,61 @@ unsafe impl core::detect::TargetFeatureRuntime for Runtime { } ``` +The initial user of this feature will be `libstd` itself, which will use it to +implement its own target-feature detection run-time. When `libstd` is linked +into the final binary, the target-feature detection macros will use this +run-time to detect the available target-features. + +This extension could be considered an "implementation-detail" of how to expose +the feature-detection macros in `libcore`, and can be technically stabilized at +a later time. That is, we could expose the feature-detection macros in libcore +first, worrying about the details of how to make that configurable at a later +time. + +This RFC works these details out and proposes a concrete design for them. + # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -This RFC introduces: +This RFC exports the target-feature detection macros from `libcore`, and +introduces: * a new attribute: `#[target_feature_detection_runtime]`, * a new trait: `core::detect::TargetFeatureRuntime`, * a new enum: `core::detect::TargetFeature`, and * a new function: `core::detect::is_target_feature_detected`. -and moves the target-feature detection macros to `libcore`. +Stabilizing the usage of the target-feature detection macros from `libcore` +could be done before stabilizing the rest of the APIs proposed here, and would +allow all `#![no_std]` libraries including `libcore` to use run-time +target-feature detection, and benefit from it if `libstd` is linked into the +final binary. -We could stabilize all of this in stages. 
First, we could just stabilize using -the target-feature detection macros from `libcore`, which would unlock doing -target-feature detection in `#![no_std]` libraries. Unlocking some of the main -use-cases like being able to use them in `libcore` itself, and have them do -something meaningful when `libstd` is linked. Then, we could extend that with -the rest of the API, which allows `#![no_std]` binaries to provide their own -target-feature detection runtime. +The rest of the API could be initially left as unstable and remain only used by +`libstd`. Stabilizing it would, however, allow `#![no_std]` binaries to benefit +from proper target-feature detection as well. + +## Export target-feature detection macros from libcore + +This RFC exports the feature-detection macros from `libcore`. Right now, the +only stable feature-detection macro is +`is_x86_feature_detected!("target_feature_name")`. + +If the rest of the API is stabilized, the semantics of these macros could be +made more precise, by using the rest of the API proposed here in their +specification: + +```rust +/// Returns `true` if `cfg!(target_feature = target-feature-literal)` is +/// `true`, and returns the value of `core::detect::is_target_feature_detected` +/// for the target-feature otherwise. +/// +/// If the target-feature is not a known target-feature for the current +/// `architecture`, or the required `feature()` gate to use the feature +/// is not enabled, the program is ill-formed, and a compile-time +/// diagnostic is emitted. +is_{architecture}_feature_detected!(string-literal) -> bool; +``` ## The `#[target_feature_detection_runtime]` attribute @@ -217,14 +259,24 @@ The `static` variable must implement the `core::detect::TargetFeatureRuntime` If no `#[target_feature_detection_runtime]` is provided anywhere in the dependency graph, Rust provides a default definition that always returns `false` -(no feature is detected). When `libstd` is linked, it provides a target-feature -detection run-time. 
+(no feature is detected). + +The standard library provides a target-feature detection run-time for some Rust +targets, and attempting to provide a user-defined run-time for these targets is +illegal, since that would result in two run-times being part of the dependency +graph. + +Being able to override the run-time provided by `libstd` could be pursued as an +extension, but at the time of this writing no use cases for this feature have +been found. This extension would work by only linking the `libstd` run-time if +there is no run-time in the dependency graph, similarly to how +`#[global_allocator]` currently works. ## The `core::detect::TargetFeatureRuntime` trait [runtime-trait]: #runtime-trait -The run-time must be a `static` variable of a type that implements the -`core::detect::TargetFeatureRuntime` trait: +The target-feature detection run-time must be a `static` variable of a type that +implements the `core::detect::TargetFeatureRuntime` trait: ```rust unsafe trait core::detect::TargetFeatureRuntime { @@ -235,10 +287,15 @@ unsafe trait core::detect::TargetFeatureRuntime { } ``` -This `trait`, which is part of `libcore`, is `unsafe` to implement. A correct -implementation, satisfying the specified semantics of its methods is required -for soundness of safe Rust code. That is, an incorrect implementation can cause -safe Rust code to have undefined behavior. +This `trait` is `unsafe` to implement, and a correct implementation is required +for soundness of safe Rust code. In particular, the trait method shall only +return that a feature is supported by the current thread of execution if this is +actually the case. An incorrect implementation of this trait could cause "safe" +Rust code to have undefined behavior. + +Note that the `TargetFeature` enum (see below) is `#[non_exhaustive]`, that is, +matching on this enum is required to handle unknown enum variants, and it is +always correct to return that unknown features are not available at run-time. 
## The `core::detect::TargetFeature` enum @@ -275,31 +332,6 @@ fn is_target_feature_detected(feature: core::detect::TargetFeature) -> bool; This function calls the `TargetFeatureRuntime::is_target_feature_detected` method. ---- - -Finally, this RFC moves the feature-detection macros of `libstd` to `libcore`. -Right now, the only stable feature-detection macro is -`is_x86_feature_detected!("target_feature_name")`. - -The semantics of these macros are modified to: - -```rust -/// Returns `true` if `cfg!(target_feature = string-literal)` is `true`, and -/// returns the value of `core::detect::is_feature_detected` for the feature -/// otherwise. -/// -/// If `feature` is not known to be a valid feature for the current -/// `architecture`, or the required `feature()` gates to use the feature are -/// not enabled, the program is ill-formed, and a compile-time diagnostic is -/// emitted. -is_{architecture}_feature_detected!(string-literal) -> bool; -``` - -> Implementation note: currently, the compilation-errors are emitted by the -> macro by pattern-matching on the literals. The mapping from the literals to -> the variants of the `TargetFeature` enum happens also at compile-time by -> pattern matching the literals. - # Drawbacks [drawbacks]: #drawbacks