Commit 7bdd872

Add support for -Zembed-metadata (#15378)
### What does this PR try to resolve?

This PR adds Cargo integration for the new unstable `-Zembed-metadata` rustc flag, which was implemented in rust-lang/rust#137535 ([tracking issue](rust-lang/rust#139165)). The new behavior has to be enabled explicitly using a new unstable CLI flag `-Zno-embed-metadata`.

The `-Zembed-metadata=no` rustc flag can reduce disk usage of compiled artifacts, and also the size of Rust dynamic library artifacts shipped to users. However, it is not enough to just pass this flag through `RUSTFLAGS`; it needs to be integrated within Cargo, because it interacts with how the `--emit` flag is passed to rustc, and also with how `--extern` args are passed to the final linked artifact built by Cargo. Furthermore, using the flag for all crates in a crate graph compiled by Cargo would be suboptimal (this is all described below).

When you pass `-Zembed-metadata=no` to rustc, it will not store Rust metadata in the compiled artifact. This is important when compiling libs/rlibs/dylibs, since it reduces their size on disk. However, this also means that every time we use this flag, we have to make sure that we also:

- Include `metadata` in the `--emit` flag to generate a `.rmeta` file. Otherwise no metadata would be generated whatsoever, which would mean that the artifact wouldn't be usable as a dependency.
- Also pass `--extern <dep>=<path>.rmeta` when compiling the final linkable artifact. Before, Cargo would only pass `--extern <dep>=<path>.[rlib|so|dll]`. Since with `-Zembed-metadata=no` the metadata is only in the `.rmeta` file and not in the rlib/dylib, this is needed to help rustc find out where the metadata lies.
  - Note: this essentially doubles the cmdline length when compiling the final linked artifact. Not sure if that is a concern.

The two points above are what this PR implements, and why this rustc flag needs Cargo integration.

The `-Zembed-metadata` flag is only passed to libs, rlibs and dylibs. It does not seem to make sense for other crate types. The one situation where it might make sense is proc macros, but according to @bjorn3 (who initially came up with the idea for `-Zembed-metadata`), it isn't really worth it.

Here is a table that summarizes the changes in passed flags and generated files on disk for rlibs and dylibs:

| **Crate type** | **Flags** | **Generated files** | **Disk usage** |
|--|--|--|--|
| Rlib/Lib (before) | `--emit=dep-info,metadata,link` | `.rlib` (with metadata), `.rmeta` (for pipelining) | - |
| Rlib/Lib (after) | `--emit=dep-info,metadata,link -Zembed-metadata=no` | `.rlib` (without metadata), `.rmeta` (for metadata/pipelining) | Reduced (metadata no longer duplicated) |
| Dylib (before) | `--emit=dep-info,link` | `[.so\|.dll]` (with metadata) | - |
| Dylib (after) | `--emit=dep-info,metadata,link -Zembed-metadata=no` | `[.so\|.dll]` (without metadata), `.rmeta` | Unchanged, but split between two files |

Behavior for other target kinds/crate types should be unchanged.

From the table above, we can see two benefits of using `-Zembed-metadata=no`:

- For rlibs, we no longer store their metadata twice in the target directory, thus reducing target directory size.
- For dylibs, we store essentially the same amount of data on disk, but the benefit is that the metadata is now in a separate `.rmeta` file. This means that you can ship the dylib (`.so`/`.dll`) to users without also shipping the metadata. This would slightly reduce e.g. the [size](rust-lang/rust#120855 (comment)) of the shipped rustc toolchains (note that the size reduction here is after the toolchain has been already heavily compressed).

Note that if this behavior ever becomes the default, it should be possible to simplify the code quite a bit, and essentially merge the `requires_upstream_objects` and `benefits_from_no_embed_metadata` functions.

I did a very simple initial benchmark to evaluate the space savings on cargo itself and [hyperqueue](https://github.com/It4innovations/hyperqueue) (a mid-size crate from my work) using `cargo build` and `cargo build --release` with and without `-Zembed-metadata=no`:

![image](https://github.com/user-attachments/assets/a26994a2-156f-4863-a823-1042ebe03bf0)

For debug/incremental builds, the effect is smaller, as the artifact disk usage is dwarfed by incremental artifacts and debuginfo. But for (non-incremental) release builds, disk usage (and also the number of performed I/O operations) is significantly reduced.

### How should we test and review this PR?

I wrote two basic tests. The second one tests a situation where a crate depends on a dylib dependency, which is quite rare, but the behavior of this has actually changed in this PR (see comparison table above). Testing this on various real-world projects (or even trying to enable it by default across the whole Cargo test suite?) might be beneficial.

## Unresolved questions

### Is this a breaking change?

With this new behavior, dylibs and rlibs will no longer contain metadata. If they are compiled with Cargo, that shouldn't matter, but other build systems might have to adapt.

### Should this become the default?

I think that in terms of disk size usage and performed I/O operations, it is a pure win. It should either generate less disk data (for rlibs) or the ~same amount of data for dylibs (the data will be a bit larger, because the dylib will still contain a metadata stub header, but that's like 50 bytes and doesn't scale with the size of the dylib, so it's negligible).

So I think that eventually, we should just do this by default in Cargo, unless some concerns are found. I suppose that before stabilizing we should also benchmark the effect on compilation performance.
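
The flag table in the description can be read as a small decision function. The following is a minimal sketch of that mapping for non-check builds, not Cargo's actual code: the `Kind` enum and `emit_args` function are hypothetical names made up for illustration, and the real implementation works on Cargo's `Unit` and `CrateType` types.

```rust
// Hypothetical, simplified model; these names are NOT Cargo's real types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Rlib,
    Dylib,
    ProcMacro,
    Bin,
}

impl Kind {
    /// rlib and dylib are the crate types whose metadata gets split out.
    fn benefits_from_no_embed_metadata(self) -> bool {
        matches!(self, Kind::Rlib | Kind::Dylib)
    }

    /// Only rlibs can be built from upstream metadata alone (pipelining).
    fn requires_upstream_objects(self) -> bool {
        !matches!(self, Kind::Rlib)
    }
}

/// Returns the rustc arguments the table assigns to a non-check build.
fn emit_args(kind: Kind, no_embed_metadata: bool) -> Vec<&'static str> {
    if no_embed_metadata && kind.benefits_from_no_embed_metadata() {
        // "After" rows: always emit a .rmeta and strip metadata from the artifact.
        vec!["--emit=dep-info,metadata,link", "-Z", "embed-metadata=no"]
    } else if !no_embed_metadata && !kind.requires_upstream_objects() {
        // "Before" row for rlibs: .rmeta is emitted only for pipelining.
        vec!["--emit=dep-info,metadata,link"]
    } else {
        vec!["--emit=dep-info,link"]
    }
}

fn main() {
    for kind in [Kind::Rlib, Kind::Dylib, Kind::ProcMacro, Kind::Bin] {
        for flag in [false, true] {
            println!("{:?} (no-embed-metadata={}): {:?}", kind, flag, emit_args(kind, flag));
        }
    }
}
```

The sketch leaves out check and test modes, which the description does not cover in its table.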
2 parents 79dfdd3 + 329fa50 commit 7bdd872

11 files changed, +232 -34 lines changed

src/cargo/core/compiler/build_context/target_info.rs

+15 -4
@@ -570,9 +570,10 @@ impl TargetInfo {
         mode: CompileMode,
         target_kind: &TargetKind,
         target_triple: &str,
+        gctx: &GlobalContext,
     ) -> CargoResult<(Vec<FileType>, Vec<CrateType>)> {
         match mode {
-            CompileMode::Build => self.calc_rustc_outputs(target_kind, target_triple),
+            CompileMode::Build => self.calc_rustc_outputs(target_kind, target_triple, gctx),
             CompileMode::Test | CompileMode::Bench => {
                 match self.file_types(&CrateType::Bin, FileFlavor::Normal, target_triple)? {
                     Some(fts) => Ok((fts, Vec::new())),
@@ -593,6 +594,7 @@ impl TargetInfo {
         &self,
         target_kind: &TargetKind,
         target_triple: &str,
+        gctx: &GlobalContext,
     ) -> CargoResult<(Vec<FileType>, Vec<CrateType>)> {
         let mut unsupported = Vec::new();
         let mut result = Vec::new();
@@ -613,9 +615,18 @@ impl TargetInfo {
                 }
             }
         }
-        if !result.is_empty() && !crate_types.iter().any(|ct| ct.requires_upstream_objects()) {
-            // Only add rmeta if pipelining.
-            result.push(FileType::new_rmeta());
+        if !result.is_empty() {
+            if gctx.cli_unstable().no_embed_metadata
+                && crate_types
+                    .iter()
+                    .any(|ct| ct.benefits_from_no_embed_metadata())
+            {
+                // Add .rmeta when we apply -Zembed-metadata=no to the unit.
+                result.push(FileType::new_rmeta());
+            } else if !crate_types.iter().any(|ct| ct.requires_upstream_objects()) {
+                // Only add rmeta if pipelining
+                result.push(FileType::new_rmeta());
+            }
         }
         Ok((result, unsupported))
     }
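
The effect of the last hunk on the list of expected output files can be sketched in isolation. This is an illustrative model only: the `CrateType` enum below is a reduced, hypothetical stand-in for Cargo's type, outputs are represented by file extensions rather than `FileType` values, and the `so` extension assumes a Linux target.

```rust
// Hypothetical stand-in for Cargo's `CrateType`; extensions assume Linux.
#[derive(Clone, Copy, Debug, PartialEq)]
enum CrateType {
    Rlib,
    Dylib,
    Cdylib,
}

impl CrateType {
    fn extension(self) -> &'static str {
        match self {
            CrateType::Rlib => "rlib",
            CrateType::Dylib | CrateType::Cdylib => "so",
        }
    }

    fn benefits_from_no_embed_metadata(self) -> bool {
        matches!(self, CrateType::Rlib | CrateType::Dylib)
    }

    fn requires_upstream_objects(self) -> bool {
        !matches!(self, CrateType::Rlib)
    }
}

/// Lists the extensions of the files a build of `crate_types` is expected to produce.
fn expected_outputs(crate_types: &[CrateType], no_embed_metadata: bool) -> Vec<&'static str> {
    let mut result: Vec<&'static str> = crate_types.iter().map(|ct| ct.extension()).collect();
    if !result.is_empty() {
        if no_embed_metadata && crate_types.iter().any(|ct| ct.benefits_from_no_embed_metadata()) {
            // Metadata is split out, so a .rmeta file always accompanies the artifact.
            result.push("rmeta");
        } else if !crate_types.iter().any(|ct| ct.requires_upstream_objects()) {
            // Without the flag, .rmeta is only produced for pipelining.
            result.push("rmeta");
        }
    }
    result
}

fn main() {
    println!("{:?}", expected_outputs(&[CrateType::Dylib], false));
    println!("{:?}", expected_outputs(&[CrateType::Dylib], true));
}
```

Keeping the two branches separate mirrors the diff; they push the same file type but for different reasons.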

src/cargo/core/compiler/build_runner/compilation_files.rs

+2 -1
@@ -363,6 +363,7 @@ impl<'a, 'gctx: 'a> CompilationFiles<'a, 'gctx> {
                 CompileMode::Build,
                 &TargetKind::Bin,
                 bcx.target_data.short_name(&kind),
+                bcx.gctx,
             )
             .expect("target must support `bin`");
 
@@ -540,7 +541,7 @@ impl<'a, 'gctx: 'a> CompilationFiles<'a, 'gctx> {
         let info = bcx.target_data.info(unit.kind);
         let triple = bcx.target_data.short_name(&unit.kind);
         let (file_types, unsupported) =
-            info.rustc_outputs(unit.mode, unit.target.kind(), triple)?;
+            info.rustc_outputs(unit.mode, unit.target.kind(), triple, bcx.gctx)?;
         if file_types.is_empty() {
             if !unsupported.is_empty() {
                 let unsupported_strs: Vec<_> = unsupported.iter().map(|ct| ct.as_str()).collect();

src/cargo/core/compiler/crate_type.rs

+32
@@ -76,6 +76,38 @@ impl CrateType {
         // Everything else, however, is some form of "linkable output" or
         // something that requires upstream object files.
     }
+
+    /// Returns whether production of this crate type could benefit from splitting metadata
+    /// into a .rmeta file.
+    ///
+    /// See also [`TargetKind::benefits_from_no_embed_metadata`].
+    ///
+    /// [`TargetKind::benefits_from_no_embed_metadata`]: crate::core::manifest::TargetKind::benefits_from_no_embed_metadata
+    pub fn benefits_from_no_embed_metadata(&self) -> bool {
+        match self {
+            // rlib/libs generate .rmeta files for pipelined compilation.
+            // If we also include metadata inside of them, we waste disk space, since the metadata
+            // will be located both in the lib/rlib and the .rmeta file.
+            CrateType::Lib |
+            CrateType::Rlib |
+            // Dylibs do not have to contain metadata when they are used as a runtime dependency.
+            // If we split the metadata into a separate .rmeta file, the dylib file (that
+            // can be shipped as a runtime dependency) can be smaller.
+            CrateType::Dylib => true,
+            // Proc macros contain metadata that specifies what macro functions are available in
+            // it, but the metadata is typically very small. The metadata of proc macros is also
+            // self-contained (unlike rlibs/dylibs), so let's not unnecessarily split it into
+            // multiple files.
+            CrateType::ProcMacro |
+            // cdylib and staticlib produce artifacts that are used through the C ABI and do not
+            // contain Rust-specific metadata.
+            CrateType::Cdylib |
+            CrateType::Staticlib |
+            // Binaries also do not contain metadata
+            CrateType::Bin |
+            CrateType::Other(_) => false
+        }
+    }
 }
 
 impl fmt::Display for CrateType {

src/cargo/core/compiler/mod.rs

+32 -6
@@ -1132,13 +1132,31 @@ fn build_base_args(
 
     if unit.mode.is_check() {
         cmd.arg("--emit=dep-info,metadata");
-    } else if !unit.requires_upstream_objects() {
-        // Always produce metadata files for rlib outputs. Metadata may be used
-        // in this session for a pipelined compilation, or it may be used in a
-        // future Cargo session as part of a pipelined compile.
-        cmd.arg("--emit=dep-info,metadata,link");
+    } else if build_runner.bcx.gctx.cli_unstable().no_embed_metadata {
+        // Nightly rustc supports the -Zembed-metadata=no flag, which tells it to avoid including
+        // full metadata in rlib/dylib artifacts, to save space on disk. In this case, metadata
+        // will only be stored in .rmeta files.
+        // When we use this flag, we should also pass --emit=metadata to all artifacts that
+        // contain useful metadata (rlib/dylib/proc macros), so that a .rmeta file is actually
+        // generated. If we didn't do this, the full metadata would not get written anywhere.
+        // However, we do not want to pass --emit=metadata to artifacts that never produce useful
+        // metadata, such as binaries, because that would just unnecessarily create empty .rmeta
+        // files on disk.
+        if unit.benefits_from_no_embed_metadata() {
+            cmd.arg("--emit=dep-info,metadata,link");
+            cmd.args(&["-Z", "embed-metadata=no"]);
+        } else {
+            cmd.arg("--emit=dep-info,link");
+        }
     } else {
-        cmd.arg("--emit=dep-info,link");
+        // If we don't use -Zembed-metadata=no, we emit .rmeta files only for rlib outputs.
+        // This metadata may be used in this session for a pipelined compilation, or it may
+        // be used in a future Cargo session as part of a pipelined compile.
+        if !unit.requires_upstream_objects() {
+            cmd.arg("--emit=dep-info,metadata,link");
+        } else {
+            cmd.arg("--emit=dep-info,link");
+        }
     }
 
     let prefer_dynamic = (unit.target.for_host() && !unit.target.is_custom_build())
@@ -1637,6 +1655,8 @@ pub fn extern_args(
     let mut result = Vec::new();
     let deps = build_runner.unit_deps(unit);
 
+    let no_embed_metadata = build_runner.bcx.gctx.cli_unstable().no_embed_metadata;
+
     // Closure to add one dependency to `result`.
     let mut link_to =
         |dep: &UnitDep, extern_crate_name: InternedString, noprelude: bool| -> CargoResult<()> {
@@ -1686,6 +1706,12 @@ pub fn extern_args(
                     if output.flavor == FileFlavor::Linkable {
                         pass(&output.path);
                     }
+                    // If we use -Zembed-metadata=no, we also need to pass the path to the
+                    // corresponding .rmeta file to the linkable artifact, because the
+                    // normal dependency (rlib) doesn't contain the full metadata.
+                    else if no_embed_metadata && output.flavor == FileFlavor::Rmeta {
+                        pass(&output.path);
+                    }
                 }
             }
             Ok(())
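
The last hunk is why the final link step receives two `--extern` entries per dependency when the flag is on. Below is a rough, self-contained sketch of just that loop. `OutputFile`, `extern_args_for_link`, `bar_outputs` and the file paths are hypothetical names invented for this example; Cargo's real `extern_args` also handles the metadata-only (pipelined) case and extern options, which are omitted here.

```rust
// Hypothetical, reduced versions of Cargo's output descriptors.
#[derive(Clone, Copy, Debug, PartialEq)]
enum FileFlavor {
    Linkable,
    Rmeta,
}

struct OutputFile {
    path: &'static str,
    flavor: FileFlavor,
}

/// Builds the `--extern` arguments for a unit that links against `name`.
fn extern_args_for_link(name: &str, outputs: &[OutputFile], no_embed_metadata: bool) -> Vec<String> {
    let mut args = Vec::new();
    for output in outputs {
        // The linkable artifact is always passed; the .rmeta file is passed in
        // addition only when metadata is no longer embedded in that artifact.
        let wanted = output.flavor == FileFlavor::Linkable
            || (no_embed_metadata && output.flavor == FileFlavor::Rmeta);
        if wanted {
            args.push("--extern".to_string());
            args.push(format!("{}={}", name, output.path));
        }
    }
    args
}

/// Example outputs of a dependency named `bar` (paths are made up).
fn bar_outputs() -> Vec<OutputFile> {
    vec![
        OutputFile { path: "deps/libbar.rlib", flavor: FileFlavor::Linkable },
        OutputFile { path: "deps/libbar.rmeta", flavor: FileFlavor::Rmeta },
    ]
}

fn main() {
    println!("{:?}", extern_args_for_link("bar", &bar_outputs(), false));
    println!("{:?}", extern_args_for_link("bar", &bar_outputs(), true));
}
```

With the flag enabled the argument list for each dependency doubles, which is the command-line length concern raised in the description.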

src/cargo/core/compiler/unit.rs

+7
@@ -125,6 +125,13 @@ impl UnitInner {
         self.mode.is_any_test() || self.target.kind().requires_upstream_objects()
     }
 
+    /// Returns whether compilation of this unit could benefit from splitting metadata
+    /// into a .rmeta file.
+    pub fn benefits_from_no_embed_metadata(&self) -> bool {
+        matches!(self.mode, CompileMode::Build)
+            && self.target.kind().benefits_from_no_embed_metadata()
+    }
+
     /// Returns whether or not this is a "local" package.
     ///
     /// A "local" package is one that the user can likely edit, or otherwise

src/cargo/core/features.rs

+2
@@ -783,6 +783,7 @@ unstable_cli_options!(
     msrv_policy: bool = ("Enable rust-version aware policy within cargo"),
     mtime_on_use: bool = ("Configure Cargo to update the mtime of used files"),
     next_lockfile_bump: bool,
+    no_embed_metadata: bool = ("Avoid embedding metadata in library artifacts"),
     no_index_update: bool = ("Do not update the registry index even if the cache is outdated"),
     package_workspace: bool = ("Handle intra-workspace dependencies when packaging"),
     panic_abort_tests: bool = ("Enable support to run tests with -Cpanic=abort"),
@@ -1294,6 +1295,7 @@ impl CliUnstable {
             "msrv-policy" => self.msrv_policy = parse_empty(k, v)?,
             // can also be set in .cargo/config or with and ENV
             "mtime-on-use" => self.mtime_on_use = parse_empty(k, v)?,
+            "no-embed-metadata" => self.no_embed_metadata = parse_empty(k, v)?,
             "no-index-update" => self.no_index_update = parse_empty(k, v)?,
             "package-workspace" => self.package_workspace = parse_empty(k, v)?,
             "panic-abort-tests" => self.panic_abort_tests = parse_empty(k, v)?,

src/cargo/core/manifest.rs

+11
@@ -279,6 +279,17 @@ impl TargetKind {
         }
     }
 
+    /// Returns whether production of this artifact could benefit from splitting metadata
+    /// into a .rmeta file.
+    pub fn benefits_from_no_embed_metadata(&self) -> bool {
+        match self {
+            TargetKind::Lib(kinds) | TargetKind::ExampleLib(kinds) => {
+                kinds.iter().any(|k| k.benefits_from_no_embed_metadata())
+            }
+            _ => false,
+        }
+    }
+
     /// Returns the arguments suitable for `--crate-type` to pass to rustc.
     pub fn rustc_crate_types(&self) -> Vec<CrateType> {
         match self {

src/cargo/ops/cargo_clean.rs

+1 -1
@@ -230,7 +230,7 @@ fn clean_specs(
 
                     let (file_types, _unsupported) = target_data
                         .info(*compile_kind)
-                        .rustc_outputs(mode, target.kind(), triple)?;
+                        .rustc_outputs(mode, target.kind(), triple, clean_ctx.gctx)?;
                     let (dir, uplift_dir) = match target.kind() {
                         TargetKind::ExampleBin | TargetKind::ExampleLib(..) => {
                             (layout.build_examples(), Some(layout.examples()))

src/doc/src/reference/unstable.md

+18
@@ -91,6 +91,7 @@ Each new feature described below should explain how to use it.
     * [checksum-freshness](#checksum-freshness) --- When passed, the decision as to whether a crate needs to be rebuilt is made using file checksums instead of the file mtime.
     * [panic-abort-tests](#panic-abort-tests) --- Allows running tests with the "abort" panic strategy.
     * [host-config](#host-config) --- Allows setting `[target]`-like configuration settings for host build targets.
+    * [no-embed-metadata](#no-embed-metadata) --- Passes `-Zembed-metadata=no` to the compiler, which avoid embedding metadata into rlib and dylib artifacts, to save disk space.
     * [target-applies-to-host](#target-applies-to-host) --- Alters whether certain flags will be passed to host build targets.
     * [gc](#gc) --- Global cache garbage collection.
     * [open-namespaces](#open-namespaces) --- Allow multiple packages to participate in the same API namespace
@@ -1896,6 +1897,23 @@ The `-Z rustdoc-depinfo` flag leverages rustdoc's dep-info files to determine
 whether documentations are required to re-generate. This can be combined with
 `-Z checksum-freshness` to detect checksum changes rather than file mtime.
 
+## no-embed-metadata
+* Original Pull Request: [#15378](https://github.com/rust-lang/cargo/pull/15378)
+* Tracking Issue: [#15495](https://github.com/rust-lang/cargo/issues/15495)
+
+The default behavior of Rust is to embed crate metadata into `rlib` and `dylib` artifacts.
+Since Cargo also passes `--emit=metadata` to these intermediate artifacts to enable pipelined
+compilation, this means that a lot of metadata ends up being duplicated on disk, which wastes
+disk space in the target directory.
+
+This feature tells Cargo to pass the `-Zembed-metadata=no` flag to the compiler, which instructs
+it not to embed metadata within rlib and dylib artifacts. In this case, the metadata will only
+be stored in `.rmeta` files.
+
+```console
+cargo +nightly -Zno-embed-metadata build
+```
+
 # Stabilized and removed features
 
 ## Compile progress

tests/testsuite/build.rs

+88
@@ -6749,3 +6749,91 @@ fn renamed_uplifted_artifact_remains_unmodified_after_rebuild() {
     let not_the_same = !same_file::is_same_file(bin, renamed_bin).unwrap();
     assert!(not_the_same, "renamed uplifted artifact must be unmodified");
 }
+
+#[cargo_test(nightly, reason = "-Zembed-metadata is nightly only")]
+fn embed_metadata() {
+    let p = project()
+        .file(
+            "Cargo.toml",
+            r#"
+                [package]
+
+                name = "foo"
+                version = "0.5.0"
+                edition = "2015"
+
+                [dependencies.bar]
+                path = "bar"
+            "#,
+        )
+        .file("src/main.rs", &main_file(r#""{}", bar::gimme()"#, &[]))
+        .file("bar/Cargo.toml", &basic_lib_manifest("bar"))
+        .file(
+            "bar/src/bar.rs",
+            r#"
+                pub fn gimme() -> &'static str {
+                    "test passed"
+                }
+            "#,
+        )
+        .build();
+
+    p.cargo("build -Z no-embed-metadata")
+        .masquerade_as_nightly_cargo(&["-Z no-embed-metadata"])
+        .arg("-v")
+        .with_stderr_contains("[RUNNING] `[..]-Z embed-metadata=no[..]`")
+        .with_stderr_contains(
+            "[RUNNING] `[..]--extern bar=[ROOT]/foo/target/debug/deps/libbar-[HASH].rmeta[..]`",
+        )
+        .run();
+}
+
+// Make sure that cargo passes --extern=<dep>.rmeta even if <dep>
+// is compiled as a dylib.
+#[cargo_test(nightly, reason = "-Zembed-metadata is nightly only")]
+fn embed_metadata_dylib_dep() {
+    let p = project()
+        .file(
+            "Cargo.toml",
+            r#"
+                [package]
+                name = "foo"
+                version = "0.5.0"
+                edition = "2015"
+
+                [dependencies.bar]
+                path = "bar"
+            "#,
+        )
+        .file("src/main.rs", &main_file(r#""{}", bar::gimme()"#, &[]))
+        .file(
+            "bar/Cargo.toml",
+            r#"
+                [package]
+                name = "bar"
+                version = "0.5.0"
+                edition = "2015"
+
+                [lib]
+                crate-type = ["dylib"]
+            "#,
+        )
+        .file(
+            "bar/src/lib.rs",
+            r#"
+                pub fn gimme() -> &'static str {
+                    "test passed"
+                }
+            "#,
+        )
+        .build();
+
+    p.cargo("build -Z no-embed-metadata")
+        .masquerade_as_nightly_cargo(&["-Z no-embed-metadata"])
+        .arg("-v")
+        .with_stderr_contains("[RUNNING] `[..]-Z embed-metadata=no[..]`")
+        .with_stderr_contains(
+            "[RUNNING] `[..]--extern bar=[ROOT]/foo/target/debug/deps/libbar.rmeta[..]`",
+        )
+        .run();
+}
