You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
### What does this PR try to resolve?
This PR adds Cargo integration for the new unstable `-Zembed-metadata`
rustc flag, which was implemented in
rust-lang/rust#137535 ([tracking
issue](rust-lang/rust#139165)). The new
behavior has to be enabled explicitly using a new unstable CLI flag
`-Zno-embed-metadata`.
The `-Zembed-metadata=no` rustc flag can reduce disk usage of compiled
artifacts, and also the size of Rust dynamic library artifacts shipped
to users. However, it is not enough to just pass this flag through
`RUSTFLAGS`; it needs to be integrated within Cargo, because it
interacts with how the `--emit` flag is passed to rustc, and also how
`--extern` args are passed to the final linked artifact build by Cargo.
Furthermore, using the flag for all crates in a crate graph compiled by
Cargo would be suboptimal (this will all be described below).
When you pass `-Zembed-metadata=no` to rustc, it will not store Rust
metadata into the compiled artifact. This is important when compiling
libs/rlibs/dylibs, since it reduces their size on disk. However, this
also means that everytime we use this flag, we have to make sure that we
also:
- Include `metadata` in the `--emit` flag to generate a `.rmeta` file,
otherwise no metadata would be generated whatsoever, which would mean
that the artifact wouldn't be usable as a dependency.
- Pass also `--extern <dep>=<path>.rmeta` when compiling the final
linkable artifact. Before, Cargo would only pass `--extern
<dep>=<path>.[rlib|so|dll]`. Since with `-Zembed-metadata=no`, the
metadata is only in the `.rmeta` file and not in the rlib/dylib, this is
needed to help rustc find out where the metadata lies.
- Note: this essentially doubles the cmdline length when compiling the
final linked artifact. Not sure if that is a concern.
The two points above is what this PR implements, and why this rustc flag
needs Cargo integration.
The `-Zembed-metadata` flag is only passed to libs, rlibs and dylibs. It
does not seem to make sense for other crate types. The one situation
where it might make sense are proc macros, but according to @bjorn3 (who
initially came up with the idea for `-Zembed-metadata`, it isn't really
worth it).
Here is a table that summarizes the changes in passed flags and
generated files on disk for rlibs and dylibs:
| **Crate type** | **Flags** | **Generated files** | **Disk usage** |
|--|--|--|--|
| Rlib/Lib (before) | `--emit=dep-info,metadata,link` | `.rlib` (with
metadata), `.rmeta` (for pipelining) | - |
| Rlib/Lib (after) | `--emit=dep-info,metadata,link -Zembed-metadata=no`
| `.rlib` (without metadata), `.rmeta` (for metadata/pipelining) |
Reduced (metadata no longer duplicated) |
| Dylib (before) | `--emit=dep-info,link` | `[.so\|.dll]` (with
metadata) | - |
| Dylib (after) | `--emit=dep-info,metadata,link -Zembed-metadata=no` |
`[.so\|.dll]` (without metadata), `.rmeta` | Unchanged, but split
between two files |
Behavior for other target kinds/crate types should be unchanged.
From the table above, we can see two benefits of using
`-Zembed-metadata=no`:
- For rlibs/dylibs, we no longer store their metadata twice in the
target directory, thus reducing target directory size.
- For dylibs, we store esssentially the same amount of data on disk, but
the benefit is that the metadata is now in a separate .rmeta file. This
means that you can ship the dylib (`.so`/`.dll`) to users without also
shipping the metadata. This would slightly reduce e.g. the
[size](rust-lang/rust#120855 (comment))
of the shipped rustc toolchains (note that the size reduction here is
after the toolchain has been already heavily compressed).
Note that if this behavior ever becomes the default, it should be
possible to simplify the code quite a bit, and essentially merge the
`requires_upstream_objects` and `benefits_from_split_metadata`
functions.
I did a very simple initial benchmark to evaluate the space savings on
cargo itself and
[hyperqueue](https://github.com/It4innovations/hyperqueue) (a mid-size
crate from my work) using `cargo build` and `cargo build --release` with
and without `-Zembed-metadata=no`:

For debug/incremental builds, the effect is smaller, as the artifact
disk usage is dwarfed by incremental artifacts and debuginfo. But for
(non-incremental) release builds, the disk savings (and also performed
I/O operations) are significantly reduced.
### How should we test and review this PR?
I wrote two basic tests. The second one tests a situation where a crate
depends on a dylib dependency, which is quite rare, but the behavior of
this has actually changed in this PR (see comparison table above).
Testing this on various real-world projects (or even trying to enable it
by default across the whole Cargo suite?) might be beneficial.
## Unresolved questions
### Is this a breaking change?
With this new behavior, dylibs and rlibs will no longer contain
metadata. If they are compiled with Cargo, that shouldn't matter, but
other build systems might have to adapt.
### Should this become the default?
I think that in terms of disk size usage and performed I/O operations,
it is a pure win. It should either generate less disk data (for rlibs)
or the ~same amount of data for dylibs (the data will be a bit larger,
because the dylib will still contain a metadata stub header, but that's
like 50 bytes and doesn't scale with the size of the dylib, so it's
negligible).
So I think that eventually, we should just do this by default in Cargo,
unless some concerns are found. I suppose that before stabilizing we
should also benchmark the effect on compilation performance.
Copy file name to clipboardExpand all lines: src/doc/src/reference/unstable.md
+18
Original file line number
Diff line number
Diff line change
@@ -91,6 +91,7 @@ Each new feature described below should explain how to use it.
91
91
*[checksum-freshness](#checksum-freshness) --- When passed, the decision as to whether a crate needs to be rebuilt is made using file checksums instead of the file mtime.
92
92
*[panic-abort-tests](#panic-abort-tests) --- Allows running tests with the "abort" panic strategy.
*[no-embed-metadata](#no-embed-metadata) --- Passes `-Zembed-metadata=no` to the compiler, which avoid embedding metadata into rlib and dylib artifacts, to save disk space.
94
95
*[target-applies-to-host](#target-applies-to-host) --- Alters whether certain flags will be passed to host build targets.
95
96
*[gc](#gc) --- Global cache garbage collection.
96
97
*[open-namespaces](#open-namespaces) --- Allow multiple packages to participate in the same API namespace
@@ -1896,6 +1897,23 @@ The `-Z rustdoc-depinfo` flag leverages rustdoc's dep-info files to determine
1896
1897
whether documentations are required to re-generate. This can be combined with
1897
1898
`-Z checksum-freshness` to detect checksum changes rather than file mtime.
1898
1899
1900
+
## no-embed-metadata
1901
+
* Original Pull Request: [#15378](https://github.com/rust-lang/cargo/pull/15378)
0 commit comments