-
Notifications
You must be signed in to change notification settings - Fork 1.6k
repr(ordered_fields) #3845
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
repr(ordered_fields) #3845
Changes from 8 commits
e0cdce8
0637bbf
7e4b0c8
61ce0f2
4fa572e
755644c
0d0a79b
4f4bf18
063af08
0c0e429
9e316ff
a1700b6
d75a497
6526f5a
ed96c1b
01cfb40
f11567c
b17f192
488068e
93f53a5
a2737d5
bb8a392
612b99e
283df46
8fe1576
489b31a
a5fb9bc
88f631e
dee831d
f007f6f
703dcb9
2e36454
472cae7
6c48b17
2043a02
4c81c7c
92eb763
9fb58ad
6a16195
e7bd72a
7d7b01a
81a1a42
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,318 @@ | ||||||||||||||||
- Feature Name: `repr_ordered_fields` | ||||||||||||||||
- Start Date: 2025-08-05 | ||||||||||||||||
- RFC PR: [rust-lang/rfcs#3845](https://github.com/rust-lang/rfcs/pull/3845) | ||||||||||||||||
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||||||||||||||||
|
||||||||||||||||
# Summary | ||||||||||||||||
[summary]: #summary | ||||||||||||||||
|
||||||||||||||||
Introduce a new `repr` (let's call it `repr(ordered_fields)`, but that can be bikeshedded if this RFC is accepted) that can be applied to `struct`, `enum`, and `union` types, which guarantees a simple and predictable layout. Then provide an initial migration plan to switch users from `repr(C)` to `repr(ordered_fields)`. | ||||||||||||||||
|
||||||||||||||||
# Motivation | ||||||||||||||||
[motivation]: #motivation | ||||||||||||||||
|
||||||||||||||||
Currently `repr(C)` serves two roles | ||||||||||||||||
1. Provide a consistent, cross-platform, predictable layout for a given type | ||||||||||||||||
2. Match the target C compiler's struct/union layout algorithm and ABI | ||||||||||||||||
Comment on lines
+19
to
+21
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Big fan of doing this split, especially for Pondering the bikeshed: (This could be contrasted with other potential reprs that I wouldn't expect this RFC to add, but could consider as future work, like a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is also useful for unions, so we don't need to rely on
This also works as an argument against names like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do think that It makes me ponder whether we should just have (Which makes me think of other things like addressing rust-lang/unsafe-code-guidelines#494 by having a different constructs for "bag of maybeuninit stuff that overlap" vs "distinct options with an active-variant rule for enum-but-without-stored-discriminant". But those are definitely not this RFC.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't mind spelling it as |
||||||||||||||||
|
||||||||||||||||
But in some cases, these two cases are in tension due to platform weirdness (even on major platforms like Windows MSVC) | ||||||||||||||||
* https://github.com/rust-lang/unsafe-code-guidelines/issues/521 | ||||||||||||||||
* https://github.com/rust-lang/rust/issues/81996 | ||||||||||||||||
|
||||||||||||||||
Providing any fix for case 2 would subtly break any users of case 1, which makes this difficult to fix within a single edition. | ||||||||||||||||
|
||||||||||||||||
As an example of this tension: on Windows MSVC, `repr(C)` doesn't always match what MSVC does for ZST structs (see this [issue](https://github.com/rust-lang/rust/issues/81996) for more details) | ||||||||||||||||
RustyYato marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||||||||
|
||||||||||||||||
```rust | ||||||||||||||||
// should have size 8, but has size 0 | ||||||||||||||||
#[repr(C)] | ||||||||||||||||
struct SomeFFI([i64; 0]); | ||||||||||||||||
``` | ||||||||||||||||
|
||||||||||||||||
Of course, making `SomeFFI` size 8 doesn't work for anyone using `repr(C)` for case 1. They want it to be size 0 (as it currently is). | ||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Turns out this was not entirely right, see rust-lang/unsafe-code-guidelines#552 (comment). So probably best to remove this paragraph again. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That issue should presumably be addressed in this RFC, no? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The main problem is that we have already specified that all 1-ZST have the same ABI. We need to figure out how to clean that up but I don't think this RFC needs to do that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This RFC currently specifies that new There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I did see that you claimed this before in rust-lang/unsafe-code-guidelines#552. I'm not sure what this is referencing. If this is referencing the current implementation where 1-ZSTs are explicitly skipped in the ABI. I'm not seeing any documentation guaranteeing this behavior, so it should be fine to change this. If this is referencing this: https://internals.rust-lang.org/t/creating-1-zsts-guaranteed-to-have-same-extern-c-abi-as/19399/18
This seems rather subtle, and it should be explicitly documented somewhere (perhaps the Rust reference on type layouts/abi). And if this is indeed your reasoning, then we could (in future editions) ensure that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes it is a subtle indirect consequence, as explained at the top of rust-lang/unsafe-code-guidelines#552:
If
Yes we could. Though ABI is a cross-edition concern so it'd only help marginally to do this on new editions only. But we could try to crater whether just doing a transition for this is realistic... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I assume it would be based on the edition where the // on new edition with new `repr(C)`
#[repr(C)]
struct NoFields {}
#[repr(transparent)]
struct Foo(u32, NoFields);
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Usually it'd be based on the edition of the crate that the repr(transparent) type is defined in.
Yeah, also see rust-lang/unsafe-code-guidelines#552 (comment). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Well, yes, the whole point of this RFC is that changing the meaning of |
||||||||||||||||
|
||||||||||||||||
# Guide-level explanation | ||||||||||||||||
[guide-level-explanation]: #guide-level-explanation | ||||||||||||||||
|
||||||||||||||||
`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in memory layout. | ||||||||||||||||
|
`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in memory layout. | |
`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in-memory layout. |
"cross-platform" -- the layout will differ when there are different layouts for struct members' types, in particular primitive types can have different alignments which changes the amount of padding.
e.g., #[repr(ordered_fields)] struct S(u8, f64);
doesn't have the same layout on x86_64 and i686
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, this will need to be documented as a hazard in the ordered_fields
docs. However, the repr
itself will be cross-platform. For example, #[repr(ordered_fields)] struct Cross([u8; 3], SomeEnum);
will be truly cross-platform (given that SomeEnum
is!).
RustyYato marked this conversation as resolved.
Show resolved
Hide resolved
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is too noisy. Most code out there using repr(C) is probably fine - IIUC, if you're not targeting Windows or AIX, maybe definitely fine? - and having a bunch of allow(...) across a bunch of projects seems unfortunate.
Maybe we can either (a) only enable the lint for migration, i.e., the next edition's cargo fix would add allows for you or (b) we find some new name... C2 for the existing repr(C) usage to avoid allows. But (b) also seems too noisy to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it could just be an optional edition compatibility lint, so if someone enables e.g. rust_20xx_compatibility
it shows up but otherwise not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(a) only enable the lint for migration
That was the intention, hence the name edition_2024_repr_c
. I'll make this more clear, that this is intended to be a migration lint.
Rustfix would update to #[repr(ordered_fields)]
to preserve the current behavior. For the FFI crates, #![allow(edition_2024_repr_c)]
at the top of lib.rs
would suffice. If you have a mix of FFI and non-FFI uses of repr(C)
, then you'll have to do the work to figure out which is which, no matter what option is chosen to update repr(C)
- even adding repr(C2)
, since then the FFI use case would need to update all their reprs to repr(C2)
.
Overall, I think this scheme only significantly burdens those who have a mix of FFI and non-FFI uses of repr(C)
. But they were going to be burdened no matter what option was chosen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the new wording/lints too noisy still?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the new wording is still too noisy. We shouldn't assume that most people using repr(C)
are using it for ordering rather than FFI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That wasn't my intention, but I don't see another way to do all of the following in the next edition:
- make
repr(C)
mean - same layout/ABI as what the standard C compiler does - make
repr(ordered_fields)
- the same algorithm that's listed forrepr(C)
in the Rust reference - ensure that everyone who upgrades to the next edition gets the layout they need (as long as they read the warnings and follow the given advice)
- make it as painless as possible for people who don't mix FFI and stable ordering cases (which I suspect is the vast majority of people). In other words, each crate currently uses
repr(C)
either exclusively for FFI or exclusively for some stable layout. - for people who do mix FFI and stable ordering cases in one crate, at least the warning should give them all the places they need to double-check, and they can silence the warning on a case-by-case basis.
I'm open to suggestions on how to handle the diagnostics. Within these constraints, I think my solution is the only real option we have. If there are some objections to these constraints, I would like to hear those too, maybe I missed the mark with these constraints, and missed a potential solution because of it.
RustyYato marked this conversation as resolved.
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should repr(C)
do when the corresponding C code is rejected by the default target C compiler?
This affects, for instance, fieldless structs, which are rejected by MSVC (but accepted by GCC). By extension it then also affects structs where all fields are PhantomData
, as those should arguably be removed when translating a Rust type to a C type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should specify that repr(Rust)
or repr(ordered_fields)
1-ZSTs (incl. tuples) are always removed, but otherwise if there is no direct C equivalent (e.g. repr(Rust)
or repr(ordered_fields)
fields, or fieldless struct if not supported by the C compiler), the layout should be consistent within the same compiler version but otherwise unspecified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added this as an unresolved question, so we don't lose track of this.
I agree that we should remove any repr(Rust)
/repr(ordered_fields)
1-ZSTs, but if we have a type that otherwise doesn't have an equivalent supported C type, maybe we should either error or try working with the C community to close this gap (i.e. fieldless types). Either by ensuring that they never make it compile in the future and giving us free rein to give semantics, or by making the code compile with some layout/ABI and then matching it ourselves (this would be preferred).
If the type has no equivalent (i.e. some repr(Rust)
/repr(ordered_fields)
), then either the layout is unspecified or we could treat it some opaque blob with the same size and align, and lay it out how a C compile would lay out an struct containing an array of the same size and align. (i.e. __attribute__((aligned(ALIGN))) struct Blob { char x[SIZE] };
). This seems like it would be the most consistent.
We've been bitten before by making choices about what the C code ought to do and had to do some painful roll-backs (like this RFC, and env::set_var
). So I think being more conservative around C code is probably a good thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or we could treat it some opaque blob with the same size and align, and lay it out how a C compile would lay out an struct containing an array of the same size and align. (i.e. attribute((aligned(ALIGN))) struct Blob { char x[SIZE] };)
I don't think we should promise any particular layout -- by leaving it unspecified, we at least de jure can change what we do to match what C does if it ever becomes legal in C.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think would be nice to guarantee that if a repr(ordered_fields)
ADT is layout-compatible (same size, alignment, and field offsets, discounting repr(Rust)
ZST fields) with a struct
or union
defined in C, then it is also ABI-compatible with that C struct
/union
, as far as possible. This would allow using repr(ordered_fields)
to interoperate with C code that defines types with weird pragma
s, like AIX #pragma align (natural)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is possible, and I've listed something similar as an unresolved question (should repr(ordered_fields)
have a well-defined ABI?).
However, I think it would be best to punt this to a future RFC/ACP/discussion/etc. It doesn't seem to be required to split repr(C)
. Matching a weird pragma on a Tier 3 platform is not enough motivation to add to this RFC. (As I would like to keep this as lean as possible, only keeping the necessary portions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we currently do have an internal linear repr which is used for Box
and that does not inhibit ABI optimizations, that's necessary for box to act as a pointer type.
RustyYato marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
RustyYato marked this conversation as resolved.
Show resolved
Hide resolved
RustyYato marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
RustyYato marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
RustyYato marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This statement matches the algorithm currently used for "primitive representations" (repr(u8)
etc), not the algorithm currently used for repr(C)
or repr(C, uN)
. If this change is made, then anyone in need of specifically the current repr(C)
enum representation cannot migrate to repr(ordered_fields)
.
The difference between the two algorithms (besides the integer type used for the tag) is that in repr(C)
, all variants have their first field at the same offset, whereas in repr(uN)
, each variant’s first field has only the offset required by its type’s alignment, so individual variants may be at a lower offset. The repr(C)
enum acts as (and is documented as) a “tagged union” that contains an “untagged union”, whereas repr(uN)
does not.
Example of the difference:
#![feature(offset_of_enum)]
#[repr(align(8))]
struct A64(u64);
#[repr(C, i32)]
enum ExampleC {
Foo(u8),
Bar(A64),
}
#[repr(i32)]
enum ExamplePrim {
Foo(u8),
Bar(A64),
}
fn main() {
dbg!(
std::mem::offset_of!(ExampleC, Foo.0), // prints 8
std::mem::offset_of!(ExamplePrim, Foo.0), // prints 4
);
}
I think that either repr(ordered_fields, uN)
should instead exactly mimic repr(C, uN)
in this regard, or — perhaps a better idea — there should be a separate explicitly named representation for this, since it is rare to actually need this layout, and so users merely seeking predictable layout are better served by repr(uN)
, but they might unthinkingly put repr(ordered_fields)
on their enums along with their structs..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I didn't realize there was that difference, thanks for bringing this up. I think repr(ordered_fields)
should match what repr(C)
does now, to ease migrations (even in combo with the primitive reprs). We could add lints to suggest changing to just a primitive repr if repr(ordered_fields)
is used on an enum.
I think making the migration as painless as possible will make it much easier to split up repr(C)
up. And we can also add guidance towards other reprs via lints.
I'm open to having the repr be named differently for the different kinds of type. (For example, repr(offset_zero)
for unions was considered in another thread)
Uh oh!
There was an error while loading. Please reload this page.