[RFC] Allow packed types to transitively contain aligned types #3718

Open: wants to merge 10 commits into base: master
193 changes: 193 additions & 0 deletions text/0000-layout-packed-aligned.md
@@ -0,0 +1,193 @@
- Feature Name: `layout_packed_aligned`
- Start Date: 2024-10-24
- RFC PR: [rust-lang/rfcs#3718](https://github.com/rust-lang/rfcs/pull/3718)
- Rust Issue: [rust-lang/rust#100743](https://github.com/rust-lang/rust/issues/100743)

# Summary
[summary]: #summary

This RFC deprecates the existing `#[repr(C)]` attribute and introduces two new variants of this attribute:

- `#[repr(C(target))]`, for structs intended for interoperability with libraries compiled for the current target
- `#[repr(C(system))]`, for structs intended for interoperability with operating system APIs

Compared to `#[repr(C)]`, these new attributes require the user to clarify their usage intent. This allows us to have nested structs that are:
- Both packed and aligned.
- Packed, and transitively contain `#[repr(align)]` types.

These usages were previously prohibited under [E0588](https://doc.rust-lang.org/nightly/error_codes/E0588.html).

Existing `#[repr(C)]` usages will emit a warning and default to `#[repr(C(target))]`.

# Motivation
Contributor

@joshlf joshlf Apr 26, 2025

Can we explicitly document that a goal of this RFC is to make sure that existing uses of #[repr(C)] are required to clarify their intent (ie, to indicate compatibility with a C compiler or to indicate a "linear" layout)? In particular, this implies that #[repr(C)] itself should be deprecated, and the two behaviors should both be supported by new reprs. That will ensure that code doesn't silently do either of the following:

  • Change its behavior (if we decide to change #[repr(C)] to mean "compatible with the C compiler")
  • Remain subtly buggy (if we decide to keep #[repr(C)]'s behavior, and that code was intended to be compatible with the C compiler)

Instead, deprecating #[repr(C)] will ensure that code is required to clarify its intent by upgrading to one of the newly-added reprs.

[motivation]: #motivation

This RFC enables the following struct definitions:

```rs
#[repr(C(target), packed(2), align(4))]
struct Foo { // Alignment = 4, Size = 8
    a: u8,  // Offset = 0
    b: u32, // Offset = 2
}
```

This is commonly needed when Rust is being used to interop with existing C and C++ code bases, which may contain
unaligned types. For example in `clang` it is possible to create the following type definition, and there is
currently no easy way to create a matching Rust type:

```cpp
struct __attribute__((packed, aligned(4))) MyStruct {
    uint8_t a;
    uint32_t b;
};
```

Member

It is somewhat confusing that the Rust example uses packed(2) but the C example just uses packed. Would be better to make them equivalent.

Currently, `#[repr(packed(_))]` structs cannot be `#[repr(align(_))]` or transitively contain `#[repr(align(_))]` types. Attempting to do so results in a [hard error](https://doc.rust-lang.org/nightly/error_codes/E0588.html).

This behavior was added in the [original implementation](https://github.com/rust-lang/rust/issues/33158) of `#[repr(packed)]` due to concerns over differing behavior between MSVC and gcc/clang. This makes it cumbersome or even impossible to produce C-compatible struct layouts in Rust when the corresponding C types were annotated with both `packed` and `aligned`.

Although [The Rust Reference](https://doc.rust-lang.org/reference/type-layout.html#the-c-representation) documents the meaning
of `repr(C)` quite clearly (types are laid out linearly, according to a fixed algorithm), when you see `#[repr(C)]` in code,
its meaning can be somewhat ambiguous. The author's intention could be one of three things:
1. Having a target-independent and stable representation of the data structure for storage or transmission.
2. FFI with C and C++ libraries compiled for the same target.
3. Interoperability with operating system APIs.

Previously, `#[repr(C)]` was being used for all 3 scenarios because [E0588](https://doc.rust-lang.org/nightly/error_codes/E0588.html) prohibits the user from creating a `#[repr(C)]` struct with ambiguous layout between targets.
This RFC seeks to differentiate between 2 and 3, leaving 1 for a Rust-defined linear layout to be addressed in a separate RFC.


# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

## `#[repr(C(target))]`
Structs annotated with this attribute are guaranteed to have the same layout as a struct produced by the C compiler for the current target toolchain.
This is useful for interfacing with libraries compiled for the current target.

For example, given:
```rs
#[repr(C(target), align(4))]
struct Foo(u8);
#[repr(C(target), packed(1))]
struct Bar(Foo);
```
`align_of::<Bar>()` would be 4 on `*-pc-windows-msvc`, matching the MSVC toolchain, and 1 on all other targets.


## `#[repr(C(system))]`
Structs annotated with this attribute are guaranteed to have the same layout as a struct defined by the target OS ABI.
This is useful for interfacing with operating system APIs.

For example, given:
```rs
#[repr(C(system), align(4))]
struct Foo(u8);
#[repr(C(system), packed(1))]
struct Bar(Foo);
```
`align_of::<Bar>()` would be 4 on `*-pc-windows-msvc` and `*-pc-windows-gnu`, and 1 on all other targets. This matches the target OS (Windows).
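
The two attributes differ only in which property of the target selects the MSVC-style result. The following is a minimal sketch of that selection using today's `cfg!` predicates. The function names are hypothetical, and the returned values are the ones stated in this RFC for `Bar`, not something a current compiler computes.

```rust
/// Expected `align_of::<Bar>()` under the proposed `#[repr(C(target))]`.
/// The layout follows the C toolchain, so only `*-msvc` targets get 4.
/// (Hypothetical helper, for illustration only.)
fn bar_align_c_target(is_msvc_env: bool) -> usize {
    if is_msvc_env { 4 } else { 1 }
}

/// Expected `align_of::<Bar>()` under the proposed `#[repr(C(system))]`.
/// The layout follows the OS ABI, so every Windows target gets 4.
/// (Hypothetical helper, for illustration only.)
fn bar_align_c_system(is_windows: bool) -> usize {
    if is_windows { 4 } else { 1 }
}

fn main() {
    // Properties of the target this program is compiled for.
    let msvc = cfg!(target_env = "msvc");
    let windows = cfg!(target_os = "windows");
    println!("C(target): {}", bar_align_c_target(msvc));
    println!("C(system): {}", bar_align_c_system(windows));
}
```

On `*-pc-windows-gnu` the two predicates disagree (`target_env` is `gnu`, `target_os` is `windows`), which is the one case where the two attributes would give different answers for `Bar`.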

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

In the following paragraphs, "Decreasing M to N" means:
```
if M > N {
    M = N
}
```

"Increasing M to N" means:
```
if M < N {
    M = N
}
```


`#[repr(align(N))]` increases the base alignment of a type to be N.

`#[repr(packed(M))]` decreases the alignment of the struct fields to be M. Because the base alignment of the type
is defined as the maximum of the alignment for any fields, this also has the indirect result of decreasing the base
alignment of the type to be M.

When the align and packed modifiers are applied on the same type as `#[repr(align(N), packed(M))]`,
the alignment of the struct fields is decreased to M. Then, the base alignment of the type is
increased to N.
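
As a rough illustration of the rule above, the sketch below lays out a struct whose fields are described only by size and alignment. The names `FieldTy`, `Layout`, and `layout` are hypothetical and exist only for this example; it models the described rule, not the compiler's actual implementation. It reproduces the `Foo` example from the Motivation section.

```rust
/// Size and alignment of a field type, in bytes.
#[derive(Clone, Copy)]
struct FieldTy {
    size: usize,
    align: usize,
}

/// Computed layout of a struct.
#[derive(Debug, PartialEq)]
struct Layout {
    offsets: Vec<usize>,
    size: usize,
    align: usize,
}

/// Rounds `offset` up to the next multiple of `align` (which must be nonzero).
fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) / align * align
}

/// Lays out fields in declaration order.
/// `packed` is the M of `packed(M)` and `align` is the N of `align(N)`, if present.
fn layout(fields: &[FieldTy], packed: Option<usize>, align: Option<usize>) -> Layout {
    let mut offset = 0;
    let mut struct_align = 1;
    let mut offsets = Vec::with_capacity(fields.len());
    for f in fields {
        // "Decrease" the field alignment to M.
        let a = packed.map_or(f.align, |m| f.align.min(m));
        offset = round_up(offset, a);
        offsets.push(offset);
        offset += f.size;
        // The base alignment is the maximum of the (decreased) field alignments.
        struct_align = struct_align.max(a);
    }
    // "Increase" the base alignment to N.
    if let Some(n) = align {
        struct_align = struct_align.max(n);
    }
    Layout { offsets, size: round_up(offset, struct_align), align: struct_align }
}

fn main() {
    let u8_ty = FieldTy { size: 1, align: 1 };
    let u32_ty = FieldTy { size: 4, align: 4 };
    // Models `#[repr(C(target), packed(2), align(4))] struct Foo { a: u8, b: u32 }`.
    let foo = layout(&[u8_ty, u32_ty], Some(2), Some(4));
    println!("{:?}", foo);
}
```

Passing `None` for `align` models a plain `packed(2)` struct, whose base alignment stays at 2 and whose size is 6 rather than 8.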

When a `#[repr(packed(M))]` struct transitively contains a field with `#[repr(align(N))]` type, depending on the
target triplet, either:
Member

@RalfJung RalfJung Oct 31, 2024

This should IMO be rephrased as an algorithm that works one "layer" at a time. Computing the layout of a type T should only consider the fields of T and their properties. It should never recurse into the fields of T.

I don't currently actually understand the proposed spec here, and this should make it a lot easier to understand.

I suspect what will happen is that as part of this, we will have to introduce a new property of T that is "bubbled up" in the recursion -- a new degree of freedom that was not required so far. (@CAD97 has already alluded to this elsewhere.) Identifying and clearly describing this property will make it a lot easier to understand the resulting layout algorithm.

Member

@RalfJung RalfJung Nov 1, 2024

My current understanding of the algorithm is as follows:


We equip types with a new property, the "explicitly requested alignment". For all base types, this alignment is 1. For structs, it is by default the maximum of the explicitly requested alignments of all fields. For a struct with the #[repr(align(N))] attribute, the explicitly requested alignment is the maximum of N and the explicitly requested alignments of its fields.

Note that the explicitly requested alignment of a type can never be bigger than the required alignment of the type.

When computing the layout of a packed(P) struct, then currently we ensure each field is aligned to min(A, P), where A is the (regular) alignment of the field type. Under the new rules, only on MSVC targets, we instead ensure the field is aligned to max(E, min(A, P)), where E is the explicitly requested alignment. (Due to the aforementioned inequality, this is equivalent to min(max(E, P), A). Also, in particular the packed struct has at least this alignment itself.) This is the only time the explicitly requested alignment of a type has any effect.


I am not fully confident that this is correct. Here's a corner case:

#[repr(C, align(4))]
struct Align4(i32);

#[repr(C, align(2))]
struct Align2(Align4);

#[repr(C, packed)]
struct Packed(Align2);

What is the resulting alignment of Packed on MSVC? My proposed algorithm says 4. Is that correct?

Member

@RalfJung RalfJung Nov 1, 2024

Another corner case:

#[repr(C, align(4))]
struct Align4(i32);

struct Group(u8, Align4);

#[repr(C, packed)]
struct Packed(u16, Group);

What does the resulting layout of Packed look like? My algorithm says:

  • field 0 (type u16): offset 0
  • field 1 (type Group): offset 4
    • nested field 0 (type u8): offset 4 (relative to the beginning of Packed)
    • nested field 1 (type Align4): offset 8

Is that correct?

Member

@RalfJung RalfJung Nov 1, 2024

(The post this referred to has since been deleted.)

Oh, so having an align attribute on a struct where a field already has a higher explicitly requested alignment is an error? Should this also be an error in Rust? The RFC doesn't say so, and it would be a breaking change... but that should at least be mentioned in the RFC. It might be worth a warning if someone writes a repr(C) type in Rust that couldn't be written in C.

@CAD97 CAD97 Nov 1, 2024

https://c.godbolt.org/z/es1cGPhz9

On MSVC 19.40 (VS 17.10) in C mode,

#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

__declspec(align(4))
struct Align4
{
    int32_t _0;
};

__declspec(align(2))
struct Align2
{
    struct Align4 _0;
};

#pragma pack(push, 1)
struct Packed
{
    struct Align2 _0;
};
#pragma pack(pop)

struct Group
{
    uint8_t _0;
    struct Align4 _1;
};

#pragma pack(push, 1)
struct P
{
    uint16_t _0;
    struct Group _1;
};
#pragma pack(pop)

int main()
{
    printf("alignof(Packed) = %zu\n", alignof(struct Packed));
    printf("\n");
    printf("offsetof(P, _0) = %zu\n", offsetof(struct P, _0));
    printf("offsetof(P, _1) = %zu\n", offsetof(struct P, _1));
    printf("offsetof(P, _1._0) = %zu\n", offsetof(struct P, _1) + offsetof(struct Group, _0));
    printf("offsetof(P, _1._1) = %zu\n", offsetof(struct P, _1) + offsetof(struct Group, _1));
}

gives

alignof(Packed) = 4

offsetof(P, _0) = 0
offsetof(P, _1) = 4
offsetof(P, _1._0) = 4
offsetof(P, _1._1) = 8

Using alignas/_Alignas on the type (alignas(N) struct Tag) instead of __declspec(align) gives

warning C5274: behavior change: _Alignas no longer applies to the type 'Align4' (only applies to declared data objects)

and results of 1 / 0, 2, 2, 6; fully just ignoring the alignas modifier. Note that standard C does not permit the use of alignas for struct definitions. C++ does. In C++ mode (/std:c++latest, using struct alignas(N) Tag), MSVC gives:

alignof(Packed) = 4

offsetof(P, _0) = 0
offsetof(P, _1) = 4
offsetof(P, _1._0) = 4
offsetof(P, _1._1) = 8

along with a warning:

warning C4359: 'Align2': Alignment specifier is less than actual alignment (4), and will be ignored.

EDIT TO ADD: interesting: in C mode, __declspec(align(2)) struct Align2 gives no warning, but struct __declspec(align(2)) Align2 gives the same warning as in C++ mode. Odd. The standard C compliant way of writing the alignment (alignas on the first data member) also does not warn, in C nor C++ mode.


For completeness, clang does not seem to implement this fully, unless I made a mistake: [godbolt]

#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct [[gnu::ms_struct]] Align4
{
    alignas(4)
    int32_t _0;
};

struct [[gnu::ms_struct]] Align2
{
    // alignas(2) // error: requested alignment is less than minimum alignment of 4 for type 'struct Align4'
    struct Align4 _0;
};

struct [[gnu::packed]] Packed
{
    struct Align2 _0;
};

struct Group
{
    uint8_t _0;
    struct Align4 _1;
};

struct [[gnu::packed]] P
{
    uint16_t _0;
    struct Group _1;
};

int main()
{
    printf("alignof(Packed) = %zu\n", alignof(struct Packed));
    printf("\n");
    printf("offsetof(P, _0) = %zu\n", offsetof(struct P, _0));
    printf("offsetof(P, _1) = %zu\n", offsetof(struct P, _1));
    printf("offsetof(P, _1._0) = %zu\n", offsetof(struct P, _1) + offsetof(struct Group, _0));
    printf("offsetof(P, _1._1) = %zu\n", offsetof(struct P, _1) + offsetof(struct Group, _1));
}

alignof(Packed) = 1

offsetof(P, _0) = 0
offsetof(P, _1) = 2
offsetof(P, _1._0) = 2
offsetof(P, _1._1) = 6

- The field is added to the struct with alignment decreased to M. The packing requirement overrides the alignment requirement. (This is the case for GCC, `#[repr(C(target))]` on gnu targets, and `#[repr(C(system))]` on non-Windows targets.)
- The field is added to the struct with alignment decreased to M and then increased to N. The alignment requirement overrides the packing requirement. (This is the case for MSVC, `#[repr(C(target))]` on msvc targets, and `#[repr(C(system))]` on Windows targets.)
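
The two cases can be modelled one layer at a time by tracking, for each type, the "explicitly requested alignment" suggested in the review discussion. The sketch below is one possible reading of that formulation, not the RFC's normative algorithm. `Abi`, `Ty`, `prim`, and `layout_struct` are hypothetical names introduced for this example. For the two corner cases raised in review, it agrees with the MSVC and clang results reported in the thread.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Abi {
    Gnu,  // packing overrides alignment
    Msvc, // alignment overrides packing
}

/// The properties of a type that an enclosing struct's layout depends on.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Ty {
    size: usize,
    align: usize,     // required alignment
    requested: usize, // explicitly requested alignment (1 for primitives)
}

/// A primitive whose alignment equals its size (an assumption of this sketch).
fn prim(size: usize) -> Ty {
    Ty { size, align: size, requested: 1 }
}

fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) / align * align
}

/// Lays out one struct from its direct fields only.
/// Returns the struct's own properties and its field offsets.
fn layout_struct(
    abi: Abi,
    fields: &[Ty],
    packed: Option<usize>,
    align: Option<usize>,
) -> (Ty, Vec<usize>) {
    let mut offset = 0;
    let mut ty_align = align.unwrap_or(1);
    let mut requested = align.unwrap_or(1);
    let mut offsets = Vec::with_capacity(fields.len());
    for f in fields {
        // Decrease the field alignment to M when the struct is packed.
        let mut a = packed.map_or(f.align, |m| f.align.min(m));
        // MSVC only: increase it again to the field's requested alignment.
        if abi == Abi::Msvc {
            a = a.max(f.requested);
        }
        offset = round_up(offset, a);
        offsets.push(offset);
        offset += f.size;
        ty_align = ty_align.max(a);
        // The requested alignment bubbles up; it is only consulted on MSVC.
        requested = requested.max(f.requested);
    }
    let ty = Ty { size: round_up(offset, ty_align), align: ty_align, requested };
    (ty, offsets)
}

fn main() {
    for abi in [Abi::Gnu, Abi::Msvc] {
        let (align4, _) = layout_struct(abi, &[prim(4)], None, Some(4));
        let (align2, _) = layout_struct(abi, &[align4], None, Some(2));
        let (packed, _) = layout_struct(abi, &[align2], Some(1), None);
        let (group, _) = layout_struct(abi, &[prim(1), align4], None, None);
        let (_, p_offsets) = layout_struct(abi, &[prim(2), group], Some(1), None);
        println!("alignof(Packed) = {}, offsets of P = {:?}", packed.align, p_offsets);
    }
}
```

Because `layout_struct` reads only `size`, `align`, and `requested` of each direct field, it never recurses into nested fields. The `requested` value is the single extra property that has to be carried upward.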

# Drawbacks
[drawbacks]: #drawbacks
Member

Due to this, this RFC will actually change the layout of some types that are currently accepted on stable, on MSVC targets. That should be discussed as a drawback.


It is worth noting that while this RFC does require people to stop treating `repr(C)` as a linear layout and to treat it
instead as an ABI compatibility layout, it is not our intention to propose a breaking change: `packed` structs were previously
banned from transitively containing `aligned` fields, so the proposed default `repr(C(target))` will lay structs out in exactly
the same way as before. However, due to an oversight in the current implementation of the Rust compiler, the restriction
can actually be
[circumvented](https://github.com/rust-lang/rust/issues/100743#issuecomment-1229343705) using generics. Applications
using this pattern to circumvent the restriction may see a change in struct layout on MSVC targets.
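
A minimal sketch of the kind of pattern in question is shown below. The type names are hypothetical, and it assumes the definition-site check does not look through the type parameter, as the linked comment describes. To the best of our understanding, current compilers accept it and apply the packing on every target.

```rust
use std::mem::{align_of, size_of};

#[repr(C, align(4))]
struct Aligned(u8);

// Writing `struct Packed(Aligned);` directly is rejected with E0588.
// With a type parameter, the field type is not known at the definition,
// so the check does not fire.
#[repr(C, packed)]
struct Packed<T>(T);

fn main() {
    println!("align_of::<Aligned>()         = {}", align_of::<Aligned>());
    println!("align_of::<Packed<Aligned>>() = {}", align_of::<Packed<Aligned>>());
    println!("size_of::<Packed<Aligned>>()  = {}", size_of::<Packed<Aligned>>());
}
```

Under this RFC, `Packed<Aligned>` would keep its current alignment on non-MSVC targets, while on MSVC targets its alignment would follow the `align(4)` of the field.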

This RFC alone still doesn't make `repr(C(target))` fully match the target (MSVC) toolchain in all cases; the other known
divergences are enums with overflowing discriminants and the handling of fields of type `[T; 0]`. So while this does
improve parity, there are still edge cases to keep track of for now. These cases are left to be addressed
in future RFCs.



# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

This RFC clarifies that:
- `repr(C(target))` must interoperate with the C compiler for the target.
- `repr(C(system))` must interoperate with the operating system APIs for the target.
- Similar to Clang, `repr(C)` does not guarantee consistent layout between targets.

Alternatively, we could create syntax that allows the user to specify exactly which semantics to use when packed structs transitively contain aligned fields.
For example, a new attribute `#[repr(align_override_packed(N))]` could be used when the child's alignment should override the parent's packing.

`#[repr(align(N))]` and `#[repr(packed)]` could be used together to get the opposite behavior, where the parent/outer alignment wins.

Explicitly specifying the pack/align semantics has the drawback of complicating FFI. For example, you might need two different definition files depending on the target.

Therefore, a stable layout across compilation targets is left as future work.




# Prior art
[prior-art]: #prior-art

Clang matches the Windows ABI for `x86_64-pc-windows-msvc` and matches the GCC ABI for `x86_64-pc-windows-gnu`.

MinGW always uses the GCC ABI.
Member

Is there prior art for a compiler that can lay out types both using the Windows ABI and the GCC ABI for code within a single target? If yes, how are they distinguishing the two? If no, why does Rust need this ability?

Member

gcc apparently supports that by using the ms_struct attribute

Member

So in Rust that would correspond to:

  • have repr(C) on win-gnu targets match non-win targets
  • have a separate Windows-only repr(MS) to ask for the msvc layout


For a more full parallel to extern "ABI", we should also support repr(GCC). Then repr(C) is a sort of alias to repr(GCC) or repr(MS) chosen by the target, like extern "C" is an alias (strongly newtyped) to "sysv64"/"win64" (etc).


We already have both `C` and `system` [calling conventions](https://doc.rust-lang.org/beta/nomicon/ffi.html#foreign-calling-conventions)
to support differing behavior on `x86_windows` and `x86_64_windows`.
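
For comparison, here is a small sketch of how the two existing calling conventions are spelled today. The function names are hypothetical. On most targets the two ABIs are identical; on 32-bit x86 Windows, `"system"` selects `stdcall`.

```rust
// Uses the C calling convention of the target's C toolchain.
extern "C" fn add_c(a: i32, b: i32) -> i32 {
    a + b
}

// Uses the calling convention of the target's operating system APIs.
extern "system" fn add_system(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // The ABI is part of the function pointer type.
    let f: extern "C" fn(i32, i32) -> i32 = add_c;
    let g: extern "system" fn(i32, i32) -> i32 = add_system;
    println!("{} {}", f(1, 2), g(3, 4));
}
```

The proposed `C(target)` and `C(system)` layout attributes mirror this split: one follows the toolchain, the other follows the operating system.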


This issue was introduced in the [original implementation](https://github.com/rust-lang/rust/issues/33158) of `#[repr(packed(N))]` and has since undergone extensive community discussion:
- [#[repr(align(N))] fields not allowed in #[repr(packed(M>=N))] structs](https://github.com/rust-lang/rust/issues/100743)
- [repr(C) does not always match the current target's C toolchain (when that target is windows-msvc)](https://github.com/rust-lang/unsafe-code-guidelines/issues/521)
- [repr(C) is unsound on MSVC targets](https://github.com/rust-lang/rust/issues/81996)
- [E0587 error on packed and aligned structures from C](https://github.com/rust-lang/rust/issues/59154)
- [E0587 error on packed and aligned structures from C (bindgen)](https://github.com/rust-lang/rust-bindgen/issues/1538)
- [Support for both packed and aligned (in repr(C)](https://github.com/rust-lang/rust/issues/118018)
- [bindgen wanted features & bugfixes (Rust-for-Linux)](https://github.com/Rust-for-Linux/linux/issues/353)
- [packed type cannot transitively contain a #[repr(align)] type](https://github.com/rust-lang/rust-bindgen/issues/2179)
- [structure layout using __aligned__ attribute is incorrect](https://github.com/rust-lang/rust-bindgen/issues/867)


# Unresolved questions
[unresolved-questions]: #unresolved-questions

None for now.


# Future possibilities
[future-possibilities]: #future-possibilities

People intending for a stable struct layout consistent across targets would be directed to use `crABI`.