Skip to content

RFC: Disable niche layout optimization on enum discriminants #3803

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

madsmtm
Copy link
Contributor

@madsmtm madsmtm commented Apr 21, 2025

Allow #[repr(_, no_niches)] on field-less (C-style) enums to disable the niche layout optimization on discriminants, allowing using the enum directly in FFI.

Enums with the no_niches modifier cannot be matched on exhaustively without the use of a wildcard arm. This feature is similar but distinct from the #[non_exhaustive] attribute, since it works on an ABI level and applies in the defining module as well.


Based on @thomcc's initial work on a similar RFC. This kind of functionality has also been discussed in the past on the internals forum, more recently on Zulip and possibly many more times before that.

Pretty sure this is T-lang purview:
@rustbot label T-lang

Please remember to create inline comments for discussions to keep this RFC manageable and discussion trees resolveable.

Rendered.

@rustbot rustbot added the T-lang Relevant to the language team, which will review and decide on the RFC. label Apr 21, 2025
Comment on lines 474 to 499
## Syntax

Current options and sub-options:
1. Use `#[repr(_, something_here)]`.
Advantage: The feature semantically modifies the layout/representation.
1. `#[repr(_, no_niches)]`.
Advantage: Very precise in what it does.
Disadvantage: Uses "niche" which is a niche term.
2. `#[repr(_, non_exhaustive)]`.
Advantage: Clear that it implies `#[non_exhaustive]`.
Disadvantage: Might be confusing for users which kind of `non_exhaustive` they should use.
3. `#[repr(_, abi_stable)]`.
3. `#[repr(_, all_bits_valid)]`.
4. `#[repr(_, discriminant = no_niches)]` to mirror [RFC 3659](https://github.com/rust-lang/rfcs/pull/3659).
2. Use `#[non_exhaustive(something_here)]`.
Advantage: Might be easier to explain the effect to users ("this works just like `#[non_exhaustive]`, except stronger").
Advantage: Might align better with future additions to `#[non_exhaustive]`, such as [`#[non_exhaustive(pub)]`](https://internals.rust-lang.org/t/pre-rfc-relaxed-non-exhaustive-structs/11977).
1. `#[non_exhaustive(no_niches)]`.
2. `#[non_exhaustive(abi)]`.
3. `#[non_exhaustive(repr)]`.
4. `#[non_exhaustive(layout)]`.
3. New attribute `#[really_non_exhaustive]` or similar.

We have a kind of decision tree here, where some unresolved questions depend on the syntax. The exact syntax does not need to be decided before accepting the RFC, though we should choose one of the main "branches".

The RFC author himself is undecided on the syntax, but decided to go with `#[repr(_, no_niches)]` for the body of the RFC, to make the desired semantics clearer to the RFC reader.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be T-opsem purview in terms of "niche" vs. "ABI" vs. "non_exhaustive" terminology? See also rust-lang/unsafe-code-guidelines#122.

# Summary
[summary]: #summary

Allow `#[repr(_, no_niches)]` on field-less (C-style) enums to disable the niche layout optimization on discriminants, allowing using the enum directly in FFI.
Copy link
Contributor

@kpreid kpreid Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think “no niche optimization” is a poor framing here, because in general “disable optimization” isn’t a sturdy foundation on which to build anything, and also because talking to me about niches doesn’t tell me about the exhaustiveness change unless I think about it hard. I think it would be better to:

  • Call the repr modifier something like “allow_undeclared_variants”.
  • Reframe the RFC text around “this makes it allowed (not UB) to have a value of an undeclared variant” instead of about niches.

P.S. To put it differently: Don’t say "this turns off the thing that is incompatible with unknown variants", say "this allows you to use unknown variants" — it's more straightforward.

Copy link
Member

@thomcc thomcc Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why my initial (unpublished, but linked above) RFC called it #[repr(non_exhaustive)] 😉.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the clang's terminology: #[open]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I especially think no_niches is the wrong framing for this, because the main use case is for FFI, where niches aren't used anyway, since neither passing the enum directly nor using it in a repr(c) struct allow use of any niches.

I think the salient point here not that the compiler can't use niches for the type, but that the behavior for discrimant values outside of the defined variants is still well defined, that is it doesn't trigger Undefined Behavior.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the clang's terminology: #[open]

I proposed #[repr(open)] elsewhere on this PR, but thinking of it a bit more, I think #[open] might be a more appropriate option because it affects the semantics of the enum in a deeper way than, say, #[repr(C)] or #[repr(u8)].

}
```

This would be a fairly large bug, since the `Weather` enum suddenly grew to 2 bytes in size (instead, the `Unknown` variant should also have been updated to `u8 in 4..=255`).
Copy link
Contributor

@zachs18 zachs18 Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note that, as declared, Weather would already have been 2 bytes, since #[repr(u8)] explicitly separates the discriminant from the variants' fields, even if the discriminant could "fit" in a field's niche)

Maybe use like #[repr(bikeshed_guarantee_abi(u8))] or something to be clear that Weather is intended to be u8-like, not "fieldful enum"-like


The exact output of `#[derive(Debug)]` is [an unresolved question][derivedebug].

User-defined derives and proc-macros will have to know when to emit the extra match arm (so most user-defined derive macros probably won't support `#[repr(_, no_niches)]` initially). Proc-macros could either choose to inspect the `repr`, or to defensively emit an extra `#[allow(unreachable_patterns)] _ => {}` match arm.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it doesn't exist already, it would be useful to publish a zero-dependency crate to inspect reprs and match at least the most common ones so that not every proc macro author needs to reinvent the wheel. Such a crate could then be mentioned here.


## Changing the default

Making `#[repr(C)]` and `#[repr(uXX)]` enums have the `no_niches` modifier by default across an edition boundary could make interoperability with C easier and even further remove the footgun (anecdotally, I've seen several people surprised that `#[repr(C)] enum` isn't a good idea for interfacing with C enums).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that repr(C) should default to no niches, but repr(uN) should still default to having niches, since it's effectively repr(Rust, uN) which should default to niches.


Making `#[repr(C)]` and `#[repr(uXX)]` enums have the `no_niches` modifier by default across an edition boundary could make interoperability with C easier and even further remove the footgun (anecdotally, I've seen several people surprised that `#[repr(C)] enum` isn't a good idea for interfacing with C enums).

Would need some way to re-opt-in to niches, maybe `#[repr(_, niches)]`?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I would much prefer something along the lines of niches(on) and niches(off), or just having both no_niches and niches available when this feature is added, for forward-compatibility. This way, people can explicitly opt into having niches without worrying about whether the default is changed in a future edition.

Comment on lines +481 to +485
1. `#[repr(_, no_niches)]`.

Advantage: Very precise in what it does.

Disadvantage: Uses "niche" which is a niche term.
Copy link

@clarfonthey clarfonthey Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't like no_niches because it feels like a double-negative. To me, the purpose of the "niches" representation is so that I can be sure that extraneous values for my enum are literally impossible, not that these extraneous values are "niches" that can be filled by wrapper types like Option.

For example, you can simplify a lot of operations when you know the values an enum can have explicitly. For example, if I have an enum for days of the week with a tag between 0-6, I know that if I take some number of days % 7, it will always become valid for that enum.

I don't know if there's a good alternative to "niche" as a name and don't think it's worth stalling the RFC for, but I figured it'd be worth voicing some of the downsides of the existing name, in case people do think it's worth changing for that reason.


3. `#[repr(_, abi_stable)]`.
3. `#[repr(_, all_bits_valid)]`.
4. `#[repr(_, discriminant = no_niches)]` to mirror [RFC 3659](https://github.com/rust-lang/rfcs/pull/3659).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about #[repr(_, open)] as another option (after the option for Clang’s enum_extensibility attribute)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If repr(open) could remove the need for crates like https://crates.io/crates/open-enum that would be cool

madsmtm added a commit to madsmtm/wgpu that referenced this pull request Apr 25, 2025
The `metal` crate is currently unsound regarding unknown/future enum
variants, see gfx-rs/metal-rs#209 and
rust-lang/rfcs#3803.

`objc2-metal` fixes this by emitting C enums as a newtype + constants
for each variant, but that prevents us from importing the
variants/constants. So this commit converts to a pattern that works with
that in preparation for the migration.
cwfitzgerald pushed a commit to gfx-rs/wgpu that referenced this pull request Apr 25, 2025
The `metal` crate is currently unsound regarding unknown/future enum
variants, see gfx-rs/metal-rs#209 and
rust-lang/rfcs#3803.

`objc2-metal` fixes this by emitting C enums as a newtype + constants
for each variant, but that prevents us from importing the
variants/constants. So this commit converts to a pattern that works with
that in preparation for the migration.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.