-
Notifications
You must be signed in to change notification settings - Fork 1.6k
RFC: Disable niche layout optimization on enum discriminants #3803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
text/3803-enum-repr-no-niches.md
Outdated
## Syntax | ||
|
||
Current options and sub-options: | ||
1. Use `#[repr(_, something_here)]`. | ||
Advantage: The feature semantically modifies the layout/representation. | ||
1. `#[repr(_, no_niches)]`. | ||
Advantage: Very precise in what it does. | ||
Disadvantage: Uses "niche" which is a niche term. | ||
2. `#[repr(_, non_exhaustive)]`. | ||
Advantage: Clear that it implies `#[non_exhaustive]`. | ||
Disadvantage: Might be confusing for users which kind of `non_exhaustive` they should use. | ||
3. `#[repr(_, abi_stable)]`. | ||
3. `#[repr(_, all_bits_valid)]`. | ||
4. `#[repr(_, discriminant = no_niches)]` to mirror [RFC 3659](https://github.com/rust-lang/rfcs/pull/3659). | ||
2. Use `#[non_exhaustive(something_here)]`. | ||
Advantage: Might be easier to explain the effect to users ("this works just like `#[non_exhaustive]`, except stronger"). | ||
Advantage: Might align better with future additions to `#[non_exhaustive]`, such as [`#[non_exhaustive(pub)]`](https://internals.rust-lang.org/t/pre-rfc-relaxed-non-exhaustive-structs/11977). | ||
1. `#[non_exhaustive(no_niches)]`. | ||
2. `#[non_exhaustive(abi)]`. | ||
3. `#[non_exhaustive(repr)]`. | ||
4. `#[non_exhaustive(layout)]`. | ||
3. New attribute `#[really_non_exhaustive]` or similar. | ||
|
||
We have a kind of decision tree here, where some unresolved questions depend on the syntax. The exact syntax does not need to be decided before accepting the RFC, though we should choose one of the main "branches". | ||
|
||
The RFC author himself is undecided on the syntax, but decided to go with `#[repr(_, no_niches)]` for the body of the RFC, to make the desired semantics clearer to the RFC reader. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be T-opsem
purview in terms of "niche" vs. "ABI" vs. "non_exhaustive" terminology? See also rust-lang/unsafe-code-guidelines#122.
# Summary | ||
[summary]: #summary | ||
|
||
Allow `#[repr(_, no_niches)]` on field-less (C-style) enums to disable the niche layout optimization on discriminants, allowing using the enum directly in FFI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think “no niche optimization” is a poor framing here, because in general “disable optimization” isn’t a sturdy foundation on which to build anything, and also because talking to me about niches doesn’t tell me about the exhaustiveness change unless I think about it hard. I think it would be better to:
- Call the repr modifier something like “
allow_undeclared_variants
”. - Reframe the RFC text around “this makes it allowed (not UB) to have a value of an undeclared variant” instead of about niches.
P.S. To put it differently: Don’t say "this turns off the thing that is incompatible with unknown variants", say "this allows you to use unknown variants" — it's more straightforward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why my initial (unpublished, but linked above) RFC called it #[repr(non_exhaustive)]
😉.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about the clang's terminology: #[open]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I especially think no_niches
is the wrong framing for this, because the main use case is for FFI, where niches aren't used anyway, since neither passing the enum directly nor using it in a repr(c) struct allow use of any niches.
I think the salient point here not that the compiler can't use niches for the type, but that the behavior for discrimant values outside of the defined variants is still well defined, that is it doesn't trigger Undefined Behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about the clang's terminology:
#[open]
I proposed #[repr(open)]
elsewhere on this PR, but thinking of it a bit more, I think #[open]
might be a more appropriate option because it affects the semantics of the enum in a deeper way than, say, #[repr(C)]
or #[repr(u8)]
.
} | ||
``` | ||
|
||
This would be a fairly large bug, since the `Weather` enum suddenly grew to 2 bytes in size (instead, the `Unknown` variant should also have been updated to `u8 in 4..=255`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note that, as declared, Weather
would already have been 2 bytes, since #[repr(u8)]
explicitly separates the discriminant from the variants' fields, even if the discriminant could "fit" in a field's niche)
Maybe use like #[repr(bikeshed_guarantee_abi(u8))]
or something to be clear that Weather
is intended to be u8
-like, not "fieldful enum"-like
|
||
The exact output of `#[derive(Debug)]` is [an unresolved question][derivedebug]. | ||
|
||
User-defined derives and proc-macros will have to know when to emit the extra match arm (so most user-defined derive macros probably won't support `#[repr(_, no_niches)]` initially). Proc-macros could either choose to inspect the `repr`, or to defensively emit an extra `#[allow(unreachable_patterns)] _ => {}` match arm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it doesn't exist already, it would be useful to publish a zero-dependency crate to inspect reprs and match at least the most common ones so that not every proc macro author needs to reinvent the wheel. Such a crate could then be mentioned here.
|
||
## Changing the default | ||
|
||
Making `#[repr(C)]` and `#[repr(uXX)]` enums have the `no_niches` modifier by default across an edition boundary could make interoperability with C easier and even further remove the footgun (anecdotally, I've seen several people surprised that `#[repr(C)] enum` isn't a good idea for interfacing with C enums). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that repr(C)
should default to no niches, but repr(uN)
should still default to having niches, since it's effectively repr(Rust, uN)
which should default to niches.
|
||
Making `#[repr(C)]` and `#[repr(uXX)]` enums have the `no_niches` modifier by default across an edition boundary could make interoperability with C easier and even further remove the footgun (anecdotally, I've seen several people surprised that `#[repr(C)] enum` isn't a good idea for interfacing with C enums). | ||
|
||
Would need some way to re-opt-in to niches, maybe `#[repr(_, niches)]`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I would much prefer something along the lines of niches(on)
and niches(off)
, or just having both no_niches
and niches
available when this feature is added, for forward-compatibility. This way, people can explicitly opt into having niches
without worrying about whether the default is changed in a future edition.
1. `#[repr(_, no_niches)]`. | ||
|
||
Advantage: Very precise in what it does. | ||
|
||
Disadvantage: Uses "niche" which is a niche term. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't like no_niches
because it feels like a double-negative. To me, the purpose of the "niches" representation is so that I can be sure that extraneous values for my enum are literally impossible, not that these extraneous values are "niches" that can be filled by wrapper types like Option
.
For example, you can simplify a lot of operations when you know the values an enum can have explicitly. For example, if I have an enum for days of the week with a tag between 0-6, I know that if I take some number of days % 7, it will always become valid for that enum.
I don't know if there's a good alternative to "niche" as a name and don't think it's worth stalling the RFC for, but I figured it'd be worth voicing some of the downsides of the existing name, in case people do think it's worth changing for that reason.
|
||
3. `#[repr(_, abi_stable)]`. | ||
3. `#[repr(_, all_bits_valid)]`. | ||
4. `#[repr(_, discriminant = no_niches)]` to mirror [RFC 3659](https://github.com/rust-lang/rfcs/pull/3659). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about #[repr(_, open)]
as another option (after the option for Clang’s enum_extensibility
attribute)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If repr(open) could remove the need for crates like https://crates.io/crates/open-enum that would be cool
The `metal` crate is currently unsound regarding unknown/future enum variants, see gfx-rs/metal-rs#209 and rust-lang/rfcs#3803. `objc2-metal` fixes this by emitting C enums as a newtype + constants for each variant, but that prevents us from importing the variants/constants. So this commit converts to a pattern that works with that in preparation for the migration.
The `metal` crate is currently unsound regarding unknown/future enum variants, see gfx-rs/metal-rs#209 and rust-lang/rfcs#3803. `objc2-metal` fixes this by emitting C enums as a newtype + constants for each variant, but that prevents us from importing the variants/constants. So this commit converts to a pattern that works with that in preparation for the migration.
Allow
#[repr(_, no_niches)]
on field-less (C-style) enums to disable the niche layout optimization on discriminants, allowing using the enum directly in FFI.Enums with the
no_niches
modifier cannot be matched on exhaustively without the use of a wildcard arm. This feature is similar but distinct from the#[non_exhaustive]
attribute, since it works on an ABI level and applies in the defining module as well.Based on @thomcc's initial work on a similar RFC. This kind of functionality has also been discussed in the past on the internals forum, more recently on Zulip and possibly many more times before that.
Pretty sure this is
T-lang
purview:@rustbot label T-lang
Please remember to create inline comments for discussions to keep this RFC manageable and discussion trees resolveable.
Rendered.