add const-ub RFC #3016
Merged
Changes from 10 commits
Commits
18 commits
bb47320 add const-ub RFC (RalfJung)
656a14a typos (RalfJung)
40abf68 be mroe clear about the lack of stability (RalfJung)
2781683 future possibility: a flag to disable UB checking (RalfJung)
e78333a extend discussion of intrinsics and library UB (RalfJung)
a7c3034 tweak wording (RalfJung)
5045d5f some clarifications (RalfJung)
e1a29a7 more precise wording (RalfJung)
a90a538 clarify (RalfJung)
60bef15 better language and further clarification (RalfJung)
9eead86 require implementations to document the 'obvious' (RalfJung)
6e1739f rewrite RFC: do not require UB detection (RalfJung)
7983e46 Update text/0000-const-ub.md (RalfJung)
b515180 edits (RalfJung)
3e9cbb5 clarify that CTFE remains consistent (RalfJung)
b1734f8 move to final text location. (pnkfelix)
54f7286 My understanding is that RFC 3016 is just codifying existing behavior. (pnkfelix)
fcf400b add link to RFC PR itself. (pnkfelix)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
- Feature Name: `const_ub` | ||
- Start Date: 2020-10-10 | ||
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) | ||
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Define how UB during const evaluation is treated: | ||
some kinds of UB must be detected, the remaining UB conditions are ignored and evaluation continues in a well-defined way. | ||
However, CTFE queries causing UB are not subject to stability guarantees and thus may fail to build in the future (e.g., when more UB is detected). | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
So far, nothing is specified about what happens when `unsafe` code leads to UB during CTFE. | ||
This is a major blocker for stabilizing `unsafe` operations in const-contexts. | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
There are some values that Rust needs to compute at compile-time. | ||
This includes the initial value of a `const`/`static`, and array lengths (and, more generally, const generics). | ||
Computing these values is called compile-time function evaluation (CTFE). | ||
CTFE in Rust is very powerful and permits running almost arbitrary Rust code. | ||
This raises the question: what happens when there is `unsafe` code and it causes [Undefined Behavior (UB)][UB]? | ||
|
||
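As a minimal sketch of the const contexts listed above (the names `square`, `AREA`, `LIMIT`, `BUF`, and `Grid` are hypothetical and chosen only for illustration), each of the following items forces the compiler to run `square` during CTFE:

```rust
// A `const fn` may be evaluated at compile time when called in a const context.
const fn square(n: usize) -> usize {
    n * n
}

// Initial value of a `const`.
const AREA: usize = square(4);

// Initial value of a `static`.
static LIMIT: usize = square(3) + 1;

// Array length (the type needs the length at compile time).
const BUF: [u8; square(2)] = [0; 4];

// Const generic argument.
struct Grid<const N: usize> {
    cells: [u8; N],
}

const GRID: Grid<{ square(2) }> = Grid { cells: [1; 4] };
```

None of this code is `unsafe`, so it cannot cause UB; the question raised above only arises once `unsafe` operations are permitted in such contexts.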
The answer depends on the kind of UB: some kinds of UB are guaranteed to be detected, | ||
while other kinds of UB might either be detected, or else evaluation will continue as if the violated UB condition did not exist (i.e., as if this operation were actually defined). | ||
This can change from compiler version to compiler version: CTFE code that causes UB could build fine with one compiler and fail to build with another. | ||
|
||
This RFC does not alter the general policy that unsound code is not subject to strict stability guarantees. | ||
In other words, unsafe code may not rely on all future versions of Rust to implement this RFC. | ||
The RFC only helps *consumers* of unsafe code to be sure that right now, all UB during CTFE will be detected or inconsequential (i.e., evaluation will proceed as if there were no UB). | ||
It does not grant any new possibilities to *authors* of unsafe code; in particular, it is still considered a critical bug for CTFE code to raise UB, and no stability guarantees are made for such code (as is the case with regular runtime code raising UB). | ||
|
||
[UB]: https://doc.rust-lang.org/reference/behavior-considered-undefined.html | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
The following kinds of UB are detected by CTFE, and will cause compilation to stop with an error: | ||
* Dereferencing dangling pointers. | ||
* Using an invalid value in an arithmetic, logical or control-flow operation. | ||
|
||
These kinds of UB have in common that there is nothing sensible evaluation can do besides stopping with an error. | ||
|
||
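The two "must be detected" categories can be sketched as follows. This is an illustration under stated assumptions, not a specification: the constant names are hypothetical, the well-defined variants are shown as live code, and the UB variants are left commented out because, per the text above, they are expected to stop compilation with an error.

```rust
use std::mem::transmute;

// Well-defined: `x` is still live when the raw pointer is dereferenced.
const OK_READ: u8 = {
    let x = 5u8;
    let p = &x as *const u8;
    // SAFETY: `p` points to the live local `x`.
    unsafe { *p }
};

// Well-defined: 1 is a valid bit pattern for `bool`.
const OK_BRANCH: u8 = {
    // SAFETY: 0 and 1 are the only valid `bool` values.
    let b = unsafe { transmute::<u8, bool>(1) };
    if b { 1 } else { 0 }
};

// UB variant 1 (dangling pointer), expected to be rejected during CTFE:
//
// const fn dangling() -> *const u8 {
//     let x = 5u8;
//     &x as *const u8 // `x` is dead once the function returns
// }
// const BAD_READ: u8 = unsafe { *dangling() };

// UB variant 2 (invalid value used in control flow), expected to be rejected:
//
// const BAD_BRANCH: u8 = {
//     let b = unsafe { transmute::<u8, bool>(2) }; // 2 is not a valid `bool`
//     if b { 1 } else { 0 }
// };
```

In both commented-out cases there is no sensible result to continue with: the dangling pointer has no memory behind it, and a `bool` that is neither `true` nor `false` selects no branch.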
Other kinds of UB might or might not be detected depending on the implementation: | ||
* Dereferencing unaligned pointers. | ||
* Violating Rust's aliasing rules. | ||
* Producing an invalid value (but not using it in one of the ways defined above). | ||
* Any [other UB][UB] not listed here. | ||
|
||
Implementations should document which of these kinds of UB they detect. | ||
In rustc, none of this UB will be detected for now. | ||
However, code causing any kind of UB is still considered buggy and not subject to stability guarantees. | ||
Hence, rustc may start detecting more UB in the future. | ||
|
||
All of this UB has in common that there is an "obvious" way to continue evaluation even though the program has caused UB: | ||
we can just access the underlying memory despite alignment and/or aliasing rules being violated, and we can just ignore the existence of an invalid value as long as it is not used in some arithmetic, logical or control-flow operation. | ||
There is no guarantee that CTFE detects such UB: evaluation may either fail with an error, or continue with the "obvious" result. | ||
|
||
In particular, the RFC does not mandate whether UB caused by implementation-defined compiler intrinsics (insofar as they are supported by CTFE) is detected. | ||
However, implementations should document for each intrinsic whether UB is detected, and (if UB is ignored for an intrinsic), what the behavior of CTFE will be when UB occurs. | ||
For rustc, all intrinsic-specific UB (e.g., reaching an `unreachable` or violating the assumptions of `exact_div`) will be detected, but if intrinsics perform memory accesses, they are treated like regular accesses for UB detection (e.g., aliasing or alignment violations are not detected, and execution proceeds just ignoring this check). | ||
|
||
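To make the intrinsic case concrete, here is a small sketch using `std::hint::unreachable_unchecked`, the stable wrapper around the `unreachable` intrinsic. The function `assume_nonzero` and the constants are hypothetical names introduced only for this example.

```rust
use std::hint::unreachable_unchecked;

/// Hypothetical helper: the caller promises that `x` is not zero.
const unsafe fn assume_nonzero(x: u32) -> u32 {
    if x == 0 {
        // SAFETY: the caller guarantees `x != 0`, so this branch is never taken.
        unsafe { unreachable_unchecked() }
    }
    x
}

// The contract is upheld, so CTFE never reaches the intrinsic.
const FINE: u32 = unsafe { assume_nonzero(7) };

// Violating the contract reaches `unreachable` during CTFE. According to the
// text above, rustc is expected to detect this intrinsic-specific UB:
//
// const BAD: u32 = unsafe { assume_nonzero(0) };
```

The distinction drawn in the paragraph above is that the intrinsic's own precondition is checked, whereas memory accesses performed by an intrinsic get no extra alignment or aliasing checks beyond those of a regular access.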
The RFC also does not mandate detecting any library UB, i.e., UB caused by violating the contract of a (standard) library function. | ||
The same conditions as for intrinsics apply: implementations should document which UB is detected. | ||
If library UB is ignored, execution must continue by just following the rules of the Abstract Machine for the current implementation of the library function, treating it as if that code had no contract applied to it. | ||
In rustc, no library UB will be detected. | ||
|
||
If the compile-time evaluation uses operations that are specified as non-deterministic, | ||
and only some of the non-deterministic choices lead to CTFE-detected UB, | ||
then CTFE may choose any possible execution and thus miss the possible UB. | ||
For example, if we end up specifying the value of padding after a typed copy to be non-deterministically chosen, then padding will be initialized in some executions and uninitialized in others. | ||
If the program then performs integer arithmetic on a padding byte, that might or might not be detected as UB, depending on the non-deterministic choice made by CTFE. | ||
|
||
## Note to implementors | ||
|
||
This requirement implies that CTFE must happen on code that was *not subject to UB-exploiting optimizations*. | ||
In general, optimizations of Rust code may assume that the source program does not have UB, so programs that exhibit UB can simply be ignored when arguing for the correctness of an optimization. | ||
However, this can lead to programs with UB being translated into programs without UB, so if constant evaluation runs after such an optimization, it might fail to detect the UB. | ||
The only permissible optimizations are those that preserve all UB and that preserve the behavior of programs whose UB CTFE does not detect. | ||
Formally speaking, this means they must be correct optimizations for the abstract machine *that CTFE actually implements*, not just for the abstract machine that specifies Rust; and moreover they must preserve the location and kind of UB that is detected by CTFE. | ||
|
||
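The following sketch illustrates how a UB-exploiting optimization can hide UB from CTFE. Both functions are hypothetical and written by hand; the second one stands in for what an optimizer that assumes "no UB" might produce from the first, and is not the output of any actual rustc pass.

```rust
/// Source program: reads through `p` but never uses the result.
/// Calling this with a dangling `p` is UB because of the load.
const unsafe fn read_and_discard(p: *const u8) -> u8 {
    // SAFETY: the caller guarantees that `p` is valid for reads.
    let _unused = unsafe { *p };
    0
}

/// Hand-written stand-in for the "optimized" program: the dead load is gone,
/// so evaluating this version can no longer observe a dangling `p`.
const unsafe fn read_and_discard_optimized(_p: *const u8) -> u8 {
    0
}

// With a valid pointer, both versions agree, which is all that a conventional
// correctness argument for the optimization needs to establish.
const A: u8 = {
    let x = 1u8;
    unsafe { read_and_discard(&x as *const u8) }
};
const B: u8 = {
    let x = 1u8;
    unsafe { read_and_discard_optimized(&x as *const u8) }
};
```

If CTFE ran on the second version, a call with a dangling pointer would evaluate to `0` instead of being reported, which is why the requirement above restricts CTFE to code that has not been transformed in this way.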
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
To be able to either detect UB or continue evaluation in a well-defined way, CTFE must run on unoptimized code. | ||
This means when compiling a `const fn` in some crate, the unoptimized code needs to be stored. | ||
So either the code is stored twice (optimized and unoptimized), or optimizations can only happen after all CTFE results have been computed. | ||
[Experiments in rustc](https://perf.rust-lang.org/compare.html?start=35debd4c111610317346f46d791f32551d449bd8&end=3dbdd3b981f75f965ac04452739653a3d47ff0ed) showed a severe performance impact on CTFE stress-tests, but no impact on real code except for a slowdown of "incr-unchanged" (which are rather fast so small changes lead to large percentages). | ||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
The most obvious alternative is to say that UB during CTFE will definitely be detected. | ||
However, that is expensive and might even be impossible. | ||
Even Miri does not currently detect all UB, and Miri is already performing many additional checks that would significantly slow down CTFE. | ||
Furthermore, implementing these checks requires a more precise understanding of UB than we currently have; basically, this would block having any potentially-UB operations at const-time on having a spec for Rust that precisely describes their UB in a checkable way. | ||
In particular, this would mean we need to decide on an aliasing model before permitting raw pointers in CTFE. | ||
|
||
To avoid the need for keeping the unoptimized sources of `const fn` around, we could weaken the requirement for detecting UB and instead say that UB might cause arbitrary evaluation results. | ||
Under the assumption that unsound code is not subject to the usual stability guarantees, this is an option we can still move to in the future, should it turn out that the proposal made in this RFC is too expensive. | ||
|
||
Another extreme alternative would be to say that UB during CTFE may have arbitrary effects in the host compiler, including host-level UB. | ||
Basically this would mean that CTFE would be allowed to "leave its sandbox". | ||
This would allow JIT'ing CTFE and running the resulting code unchecked. | ||
While compiling untrusted code should only be done with care (including additional sandboxing), this seems like an unnecessary extra footgun. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
C++ requires compilers to detect UB in `constexpr`. | ||
However, the fragment of C++ that is available to `constexpr` excludes pointer casts, pointer arithmetic (beyond array bounds), and union-based type punning, which makes such checks not very complicated and avoids most of the poorly specified parts of UB. | ||
The corresponding type-punning-free fragment of Rust (no raw pointers, no `union`, no `transmute`) can only cause UB that is defined to be definitely detected during CTFE. | ||
In that sense, Rust achieves feature parity with C++ in terms of UB detection during CTFE. | ||
(Indeed, this was the prime motivation for making such strict UB detection requirements in the first place.) | ||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
Currently none. | ||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
|
||
This RFC provides an easy way forward for "unconst" operations, i.e., operations that are safe at run-time but not at compile-time. | ||
Primary examples of such operations are anything involving the integer representation of pointers, which cannot be known at compile-time. | ||
If this RFC were accepted, we could declare such operations "definitely detected UB" during CTFE (and thus naturally they would only be permitted in an `unsafe` block). | ||
|
||
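A pointer-to-integer cast is a simple instance of such an "unconst" operation. In the sketch below, `address_of` is a hypothetical helper; the commented-out constant shows the compile-time counterpart, which rustc currently rejects outright rather than treating it as detected UB.

```rust
/// Safe at run time: the pointer has a concrete address once the program runs.
pub fn address_of(x: &u8) -> usize {
    x as *const u8 as usize
}

// At compile time the final address is not yet known, so the same cast has no
// meaningful result. Under the idea sketched above, it could be permitted in
// an `unsafe` block and classified as "definitely detected UB" during CTFE:
//
// const ADDR: usize = &0u8 as *const u8 as usize;
```

At run time the function is unremarkable: `address_of(&value)` yields some non-zero address that may differ from one execution to the next.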
If UB checks turn out to be expensive, the RFC leaves the option of adding a flag to let users opt out of UB checking. | ||
This would speed up compilation without changing the behavior of correct code. | ||
|
||
The RFC clarifies that there is no *guarantee* that code with UB is evaluated in any particular way, so if we want to detect more UB during CTFE in the future, we are free to do so from a stability perspective. |