Emit retags in codegen

# Proposal
This is part of project goal [#392](https://github.com/rust-lang/rust-project-goals/issues/392).

Both Stacked Borrows and Tree Borrows rely on retags to create and update the permissions associated with pointers. However, the information that Miri uses to determine where a retag should occur is lost during codegen. We need a way to recover this information in order to detect aliasing violations in lower-level representations of Rust programs. This is necessary to support third-party tools like [BorrowSanitizer](https://borrowsanitizer.com/), which aim to detect Rust-specific undefined behavior in multilanguage programs.

## Design
We propose adding a new unstable flag `-Zcodegen-emit-retag`. When this flag is set, the following function call will be emitted whenever a retag needs to happen (shown in LLVM IR):
```llvm
ptr @__retag(ptr, i64, i64, ptr)
``` 
This is just a vehicle for type information; third-party tools will replace this call with their own implementations. Its parameters are:

1. Target (`ptr`) - The pointer being retagged.

2. Size (`i64`) - The size in bytes of the range, starting at the address of the pointer being retagged, that the new permission covers within the allocation pointed to by the target.

3. Permission Type (`i64`) - The permission created by the retag. Third-party tools will be able to configure this by overriding a compiler query. You can expect this to be serialized from something equivalent to Miri's [`NewPermission`](https://doc.rust-lang.org/beta/nightly-rustc/miri/borrow_tracker/tree_borrows/struct.NewPermission.html).

4. Interior Mutable Fields (`ptr`) - A pointer to a constant array of pairs of `i64`. Each pair is an offset from the target pointer and a size, indicating an interior mutable field within the pointee type of the target.

The return value is an alias for the target pointer. Additional information can be encoded for convenience using LLVM metadata nodes attached to this function call. For example, function-entry retags will have a `fn_entry` metadata node. 
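
To make the fourth parameter more concrete, the sketch below computes the (offset, size) pairs that such an array might describe for a small pointee type. The `Example` struct and the `interior_mutable_ranges` helper are hypothetical names used only for illustration; this proposal does not specify how the array is laid out in memory or how its length is communicated.

```rust
use std::cell::Cell;
use std::mem::{offset_of, size_of};

// Hypothetical pointee type with two interior mutable fields.
// `repr(C)` keeps the field order fixed so the offsets are predictable.
#[allow(dead_code)]
#[repr(C)]
struct Example {
    a: u32,       // offset 0, not interior mutable
    b: Cell<u16>, // offset 4, interior mutable
    c: u64,       // offset 8, not interior mutable
    d: Cell<u8>,  // offset 16, interior mutable
}

// Returns one (offset, size) pair per interior mutable field, both in bytes.
// Assumption: this mirrors the "pairs of `i64`" described for parameter 4.
fn interior_mutable_ranges() -> [(u64, u64); 2] {
    [
        (offset_of!(Example, b) as u64, size_of::<Cell<u16>>() as u64),
        (offset_of!(Example, d) as u64, size_of::<Cell<u8>>() as u64),
    ]
}
```

For a retag of `&Example`, a tool could use these ranges to treat bytes 4..6 and 16..17 differently from the rest of the pointee.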

## Implementation Notes

We will determine where to emit retag function calls during codegen without using MIR [`Retag`](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/mir/enum.StatementKind.html#variant.Retag) statements (which are likely [going away](https://github.com/rust-lang/unsafe-code-guidelines/issues/371#issuecomment-3346404729)). Broadly, parameters containing references are retagged on entry to a function, and values with references are retagged when they are copied by MIR assignment statements and function call terminators. When an aggregate contains references, we will recurse into its fields and branch on its variants as necessary. For more implementation details, take a look at this [pre-RFC](https://internals.rust-lang.org/t/pre-rfc-emit-retags-in-codegen/23706/7) and its discussion.
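
The recursion into aggregates can be pictured with a toy type model. This is only a sketch under simplifying assumptions: the `Ty` enum and `ref_offsets` function are hypothetical and are not rustc's type representation, and enum variants (which would need a run-time branch on the discriminant) are left out.

```rust
// Toy model of a type layout; not rustc's `Ty` or `Layout`.
enum Ty {
    // A value with no references inside it.
    Scalar,
    // A reference, which is a retag candidate.
    Ref,
    // An aggregate: each field is (byte offset within the aggregate, field type).
    Struct(Vec<(u64, Ty)>),
}

// Collects the byte offsets, relative to the start of the outermost value,
// of every reference that would need a retag.
fn ref_offsets(ty: &Ty, base: u64, out: &mut Vec<u64>) {
    match ty {
        Ty::Scalar => {}
        Ty::Ref => out.push(base),
        Ty::Struct(fields) => {
            for (offset, field) in fields {
                // Recurse into each field, accumulating its offset.
                ref_offsets(field, base + offset, out);
            }
        }
    }
}

// Convenience wrapper that starts the walk at offset 0.
fn collect_ref_offsets(ty: &Ty) -> Vec<u64> {
    let mut out = Vec::new();
    ref_offsets(ty, 0, &mut out);
    out
}
```

Under this model, a struct holding a scalar, a reference, and a nested struct that itself starts with a reference yields one retag candidate per reference, each at its accumulated offset.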

One key difference between Stacked and Tree Borrows is that under Stacked Borrows, raw pointers are retagged after being cast from references. Until a decision is made on this, we will attempt to retag these pointers. However, any retag can be skipped by having the compiler query that creates the "Permission Type" parameter return an `Option`. If the permission is `None`, then we will not emit a retag.
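
The effect of an `Option`-returning query can be sketched as follows. Everything here is hypothetical: `PointerKind`, `Model`, `retag_permission`, and the numeric permission encodings are placeholders for illustration, not the compiler query or the serialized format this proposal would introduce.

```rust
// Hypothetical classification of the pointer being considered for a retag.
#[derive(Clone, Copy, Debug, PartialEq)]
enum PointerKind {
    SharedRef,
    MutRef,
    // A raw pointer that was just cast from a reference.
    RawFromRef,
}

// Hypothetical tool configuration: whether raw pointers cast from
// references are retagged (as under Stacked Borrows) or skipped.
#[derive(Clone, Copy)]
enum Model {
    RetagRawPointers,
    SkipRawPointers,
}

// Stand-in for the query that produces the "Permission Type" parameter.
// `None` means no retag call is emitted. The integers are placeholders.
fn retag_permission(model: Model, kind: PointerKind) -> Option<u64> {
    match (model, kind) {
        (_, PointerKind::SharedRef) => Some(0),
        (_, PointerKind::MutRef) => Some(1),
        (Model::RetagRawPointers, PointerKind::RawFromRef) => Some(2),
        (Model::SkipRawPointers, PointerKind::RawFromRef) => None,
    }
}

// Counts how many retag calls would be emitted for a sequence of candidates.
fn count_emitted_retags(model: Model, kinds: &[PointerKind]) -> usize {
    kinds
        .iter()
        .filter(|kind| retag_permission(model, **kind).is_some())
        .count()
}
```

With this shape, a tool that does not want raw-pointer retags only changes what the query returns; the codegen logic that decides where retags are considered stays the same.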

We have [a prototype implementation](https://github.com/BorrowSanitizer/rust/commit/c0f33b5503f7d972f0359ba0cefa5d7f819deac2) that we are currently testing in our ongoing development of BorrowSanitizer. This still needs to be modified to remove our dependency on MIR retag statements, though. 

### Extensions
Even if we move away from retagging raw pointers, we still expect that tool designers will want to be able to identify conversions between references and raw pointers at the LLVM level. This would make it possible to use a conservative static analysis to identify when we can skip run-time checks for allocations that are never accessed via raw pointers (see [LiteRSan](https://arxiv.org/abs/2509.16389) for a proof-of-concept using AddressSanitizer). As an optional extension to this MCP, we could identify these casts by emitting a second intrinsic:
```llvm
ptr @__expose_tag(ptr)
```
This creates an alias for its argument. Like retag intrinsics, it will need to be eliminated by third-party tools. We would also add a `from_raw` metadata annotation to retag intrinsics, indicating when references are created from raw pointers.
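
At the source level, the two conversions this extension targets look like the following. The `roundtrip` function is a hypothetical example, and the comments mark where the intrinsic and the annotation would plausibly apply; the exact emission points are an assumption, not something this proposal fixes.

```rust
// Converts a reference to a raw pointer and back, then writes through it.
fn roundtrip(x: &mut i32) -> i32 {
    // Reference-to-raw cast: a candidate site for `__expose_tag`.
    let p: *mut i32 = x as *mut i32;
    // Raw-to-reference conversion: a retag that would carry `from_raw`.
    // Sound here because `p` was derived from a live, unique reference.
    let r: &mut i32 = unsafe { &mut *p };
    *r += 1;
    *r
}
```

A function that never performs the first kind of cast on a given allocation is the case where a conservative static analysis could skip run-time checks.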

# Mentors or Reviewers

* @RalfJung
* @tmandry

# Process

The main points of the [Major Change Process][MCP] are as follows:

* [x] File an issue describing the proposal.
* [ ] A compiler team member who is knowledgeable in the area can **second** by writing `@rustbot second` or kickoff a team FCP with `@rfcbot fcp $RESOLUTION`.
    * Refer to [Proposals, Approvals and Stabilization](https://forge.rust-lang.org/compiler/proposals-and-stabilization.html) docs for when a second is sufficient, or when a full team FCP is required.
* [ ] Once an MCP is seconded, the **Final Comment Period** begins.
    * Final Comment Period lasts for 10 days after all outstanding concerns are resolved.
    * Outstanding concerns will block the Final Comment Period from finishing. Once all concerns are resolved, the 10-day countdown is restarted.
    * If no concerns are raised after 10 days since the resolution of the last outstanding concern, the MCP is considered **approved**.

You can read [more about Major Change Proposals on forge][MCP].

[MCP]: https://forge.rust-lang.org/compiler/proposals-and-stabilization.html#how-do-i-submit-an-mcp

