Add PGO Applicability Feature (Prototype) #5193

Open
wants to merge 27 commits into
base: main
27 commits
a7f6494
tools/devtool: add PGO helper scripting to devtool
valeriepieger May 3, 2025
69e3eae
docs/pgo-getting-started.md: added a getting started guide for PGO
valeriepieger Apr 27, 2025
41864da
feat: Add PVTime support for ARM
DakshinD Apr 24, 2025
10429dc
test: Add steal time integration tests
DakshinD Apr 24, 2025
a2ae6bd
doc: Add changelog entry for PVTime
DakshinD Apr 24, 2025
83b3e77
build(deps): Bump the firecracker group with 12 updates
dependabot[bot] Apr 28, 2025
e8a6d6a
test(signal): remove flaky unit test
Manciukic May 1, 2025
1ebf723
test(jailer): use tmp dir for mknod test
Manciukic Apr 29, 2025
ead75b7
refactor(test/cgroup): use a TempDir instead of a manually created dir
Manciukic May 1, 2025
377b7fa
test(net): add retry in test_tap_offload
Manciukic Apr 29, 2025
482b2ec
test(balloon): ensure we gave stats enough time to update before reading
Manciukic Apr 29, 2025
421b155
fix(test/pvtime): bump tolerance to 10s
Manciukic Apr 30, 2025
080c876
fix(test_reboot): do not assert thread count
Manciukic Apr 30, 2025
ee6a2cc
feat(tests): add systemd_analyze data as boot time metric
ShadowCurse Apr 29, 2025
7d164e6
refactor(tests): update boot time device metrics
ShadowCurse Apr 29, 2025
65b8b33
feat(tests): add resume metric to boot time tests
ShadowCurse Apr 29, 2025
949ac4c
chore: ignore boot test resume_time metric in A/B
ShadowCurse Apr 30, 2025
783ca67
doc: fix arguments to ab_test.py in README
roypat May 2, 2025
29517f6
fix(ab): only reduce dimension set for printing
roypat May 2, 2025
3a990f2
ab: ignore some block throughput metrics on m8g
roypat May 2, 2025
7579beb
build(deps): Bump the firecracker group with 9 updates
dependabot[bot] May 5, 2025
61a93c5
allow specifying custom cpu template inline in config json
roypat Apr 29, 2025
3582c78
chore: Update CHANGELOG in preparation of 1.12.0 release
zulinx86 May 6, 2025
85099fe
chore: Update release status in preparation of v1.12.0 release
zulinx86 May 6, 2025
3827acd
chore: Add CHANGELOG entries for newly supported instance types
zulinx86 May 6, 2025
bacff13
chore: Bump Rust dependencies
zulinx86 May 6, 2025
06b7ece
chore: Bump version to 1.13.0-dev
zulinx86 May 6, 2025
213 changes: 213 additions & 0 deletions docs/pgo-getting-started.md
@@ -0,0 +1,213 @@
# Profile-Guided Optimization (PGO) for Firecracker

This document describes how to build Firecracker with Profile-Guided
Optimization (PGO) as a separate, opt-in build flow.

PGO can help improve performance by using runtime profiling data to guide
compiler optimizations. This process is fully **optional** and does **not**
affect the default Firecracker build system.

## Overview

PGO allows the Rust compiler (via LLVM) to use runtime profiling data to
optimize the generated binary for actual workloads. This can improve
performance for CPU-bound applications like Firecracker.

The instrumented binary writes `.profraw` files at runtime. These are merged
into a `.profdata` file, which is then used in an optimized build of
Firecracker.

The PGO build process involves three main phases:

1. **Instrumentation**: Build Firecracker with instrumentation to collect
profiling data.
1. **Profiling**: Run realistic workloads to generate `.profraw` profiling
files.
1. **Optimize**: Rebuild Firecracker using the collected profiling data for
improved performance.
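
As a rough sketch, the three phases map onto plain `cargo` and `llvm-profdata` invocations along the following lines. The `-Cprofile-generate` flag and the profile directory come from this guide; `-Cprofile-use` is rustc's counterpart flag for consuming a merged profile. The function names and the `merged.profdata` file name are illustrative assumptions, and `devtool pgo_build` may differ in detail.

```shell
# Assumed locations. The directory matches the -Cprofile-generate path used in
# this guide; the merged file name is hypothetical.
PROFDATA_DIR="${PROFDATA_DIR:-/tmp/firecracker-profdata}"
PROFDATA_FILE="${PROFDATA_FILE:-$PROFDATA_DIR/merged.profdata}"

# Phase 1: build a binary that writes .profraw files into PROFDATA_DIR.
pgo_instrument() {
    RUSTFLAGS="-Cprofile-generate=$PROFDATA_DIR" \
        cargo build --release --package firecracker
}

# Between phases 2 and 3: merge the raw profiles into one .profdata file.
pgo_merge() {
    llvm-profdata merge -o "$PROFDATA_FILE" "$PROFDATA_DIR"/*.profraw
}

# Phase 3: rebuild using the merged profile.
pgo_optimize() {
    RUSTFLAGS="-Cprofile-use=$PROFDATA_FILE" \
        cargo build --release --package firecracker
}
```

Phase 2 (profiling) is not a build step: it consists of running the instrumented binary under the workloads of interest between `pgo_instrument` and `pgo_merge`.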

## 1. Build with Instrumentation

Build Firecracker with profiling instrumentation enabled. From the root of the
`firecracker` repository, run:

```
./tools/devtool pgo_build instrument
```

This produces a binary that, when executed, generates `.profraw` files
containing runtime behavior data.

______________________________________________________________________

**Note:** The ideal environment for PGO is the same as the ideal environment
for Firecracker: x86_64 architecture, Ubuntu (currently 24.04), and bare metal
(so that `/dev/kvm` is exposed). However, this step specifically can also be
done on non-x86_64 machines with
`RUSTFLAGS="-Cprofile-generate=/tmp/firecracker-profdata" cargo build --release --package firecracker`.

### Common Issue: Failed to run custom build command for `cpu-template-helper`

Try ensuring that the build directory exists and is writable:

```
mkdir -p src/cpu-template-helper/build
chmod -R u+rw src/cpu-template-helper/build
```

Also ensure all dependencies (e.g., aws-lc-sys, userfaultfd-sys) can be built by
running:

```
cargo clean
cargo build --release
```

### Common Issue: failed to run custom build command for userfaultfd-sys v0.5.0

Try: `sudo apt install libclang-dev clang pkg-config`

### Common Issue: failed to run custom build command for aws-lc-sys v0.28.1

Try: `sudo apt install cmake ninja-build perl`

### Common Issue: many errors like the following

```
OUTPUT: Failed to compile memcmp_invalid_stripped_check
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
```

Check whether include paths are overridden globally in your environment, for
example through the `CPATH` or `C_INCLUDE_PATH` variables.

## 2. Profiling

Run realistic workloads to generate these `.profraw` files. Here are some
examples of typical workloads:

- Boot a microVM
- Simulate network activity on a microVM
- Simulate basic I/O on a microVM

Try to exercise every major subsystem you care about optimizing, so that you
can benchmark them against the baseline build later.

Here's an example process of booting a minimal microVM:
Review comment (Contributor):

If we generate all the binaries with pgo profile information enabled in a folder (build/cargo_target/${toolchain}/pgo-instrument), we could then run the devtool tests directly passing the --binary-dir option.

However, this will introduce the problem of extracting the profiling information from the container


1. Download a test kernel and rootfs.
1. Start Firecracker.
1. In another terminal, use `curl` to configure and start the microVM. For
   example:

```
# API_SOCKET must be the socket path passed to Firecracker via --api-sock.

# Configure boot source
curl --unix-socket "$API_SOCKET" -i \
    -X PUT 'http://localhost/boot-source' \
    -H 'Content-Type: application/json' \
    -d '{
        "kernel_image_path": "vmlinux.bin",
        "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
    }'

# Configure rootfs
curl --unix-socket "$API_SOCKET" -i \
    -X PUT 'http://localhost/drives/rootfs' \
    -H 'Content-Type: application/json' \
    -d '{
        "drive_id": "rootfs",
        "path_on_host": "rootfs.ext4",
        "is_root_device": true,
        "is_read_only": false
    }'

# (Optional) set machine config if you want custom vCPU/RAM:
curl --unix-socket "$API_SOCKET" -i \
    -X PUT 'http://localhost/machine-config' \
    -H 'Content-Type: application/json' \
    -d '{
        "vcpu_count": 1,
        "mem_size_mib": 128
    }'

# Start the VM
curl --unix-socket "$API_SOCKET" -i \
    -X PUT 'http://localhost/actions' \
    -H 'Content-Type: application/json' \
    -d '{"action_type":"InstanceStart"}'
```

Please refer to the
[Firecracker getting started guide](https://github.com/firecracker-microvm/firecracker/blob/main/docs/getting-started.md)
for a more in-depth walkthrough.
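
The `curl` calls above all share the same shape, so a profiling script that repeats them across several workloads could wrap them in a small helper. This is only a sketch: `fc_put` is a hypothetical name, not part of Firecracker or `devtool`, and the default socket path is an assumption taken from the `--api-sock` example later in this guide.

```shell
# Socket path is an assumption; it must match the --api-sock value given to
# the instrumented Firecracker process.
API_SOCKET="${API_SOCKET:-/tmp/fc.socket}"

# fc_put <api-path> <json-body>
# Sends one PUT request to the Firecracker API over the Unix socket.
fc_put() {
    curl --unix-socket "$API_SOCKET" -i \
        -X PUT "http://localhost/$1" \
        -H 'Content-Type: application/json' \
        -d "$2"
}

# Example use, equivalent to the last request above:
#   fc_put actions '{"action_type":"InstanceStart"}'
```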

## 3. Optimize

After running your desired workloads, list the resulting `.profraw` files
with:

```
ls /tmp/firecracker-profdata/
```
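
Before merging, it can help to confirm that the workloads actually produced usable profiles. The following is a minimal sketch that counts non-empty `.profraw` files; `count_profraw` is a hypothetical helper, and the default directory is the one used in this guide.

```shell
# count_profraw [dir]
# Prints the number of non-empty .profraw files directly inside dir.
count_profraw() {
    dir="${1:-/tmp/firecracker-profdata}"
    # -size +0c skips zero-byte files, which carry no profile data.
    find "$dir" -maxdepth 1 -type f -name '*.profraw' -size +0c | wc -l | tr -d ' '
}

# Example use:
#   [ "$(count_profraw)" -gt 0 ] || echo "no profiles found; rerun the workloads"
```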

______________________________________________________________________

#### Merging

To merge these files into a `.profdata` file, run:

```
./tools/devtool pgo_build merge
```

#### Common Issue: version mismatch

The error looks something like:
`raw profile format version = 10; expected version = 9`

This is common and can happen even in an ideal environment, because the Rust
toolchain produces a v10 raw profile while the `llvm-profdata` packaged for
Ubuntu 24.04 does not support v10. You may be able to install a matching LLVM
on your host; if that proves difficult, the `llvm-profdata` shipped with Rust's
nightly toolchain can also work.

To use nightly, try:

```
rustup toolchain install nightly
rustup component add llvm-tools-preview --toolchain nightly

export PATH="$HOME/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:$PATH"
```
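
Rather than hard-coding the nightly path, the location of a toolchain's bundled `llvm-profdata` can be derived from its sysroot and host triple. A minimal sketch, assuming the `llvm-tools-preview` component is installed for that toolchain; `toolchain_profdata` is a hypothetical helper name, and the directory layout is the one visible in the `PATH` entry above.

```shell
# toolchain_profdata <sysroot> <host-triple>
# Prints where the toolchain's bundled llvm-profdata is expected to live.
toolchain_profdata() {
    printf '%s/lib/rustlib/%s/bin/llvm-profdata\n' "$1" "$2"
}

# Example use with the active toolchain (rustc reports both values):
#   sysroot="$(rustc --print sysroot)"
#   host="$(rustc -vV | sed -n 's/^host: //p')"
#   "$(toolchain_profdata "$sysroot" "$host")" --version
```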

______________________________________________________________________

#### Optimized Build

Once the `.profraw` files are merged into `.profdata`, you can re-build with the
merged profile:

```
./tools/devtool pgo_build optimize
```

The optimized binary should then be at
`build/cargo_target/release/firecracker`. Run it with:
```
./build/cargo_target/release/firecracker --api-sock /tmp/fc.socket
```

## 4. Verify/Benchmark

Once you have this PGO build, you can run any of the repository's existing
performance tests and compare the results against a baseline build.
Review comment (Contributor) on lines +201 to +202:

this is missing instructions on how to run the tests on the newly generated binaries


### Community Benchmark Results

Please feel free to fork this repo, run your own benchmarks, and submit a PR
updating the table below.

| Machine (CPU/RAM) | Firecracker (non-PGO) | Firecracker (PGO) | Δ (PGO vs. baseline) | Notes |
| ----------------------------- | --------------------: | ----------------: | -------------------: | -------------------------------------------- |
| AMD Ryzen 7 7700X; 32 GiB RAM | 0.01275 | 0.01079 | -15.37% | Ubuntu 24.04; used test_boottime.py for both |
Review comment (Contributor):

what is this metric? which is the unit?
it'd be interesting to see the other performance tests as well.

| | | | | |
| | | | | |
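
The Δ column is the relative change of the PGO result against the baseline, `(PGO - baseline) / baseline`. A small sketch for computing it when adding a row; `pct_delta` is a hypothetical helper, and it makes no assumption about the unit of the two measurements as long as both use the same one.

```shell
# pct_delta <baseline> <pgo>
# Prints the relative change of <pgo> versus <baseline> as a percentage.
pct_delta() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.2f%%\n", (new - base) / base * 100 }'
}

# The existing table row: 0.01275 (non-PGO) vs 0.01079 (PGO)
pct_delta 0.01275 0.01079   # prints -15.37%
```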
2 changes: 0 additions & 2 deletions rootfs.ext4

This file was deleted.

6 changes: 3 additions & 3 deletions tools/devtool
@@ -1104,20 +1104,20 @@ cmd_pgo_build() {
instrument)
echo "Building instrumented Firecracker binary"
RUSTFLAGS="-Cprofile-generate=/tmp/firecracker-profdata" cargo build --release
Review comment (Contributor):

these commands should be run inside the devctr to avoid the missing dependencies issues mentioned in the doc. Maybe we can extend cmd_build to add pgo-instrument and pgo-optimize profiles

echo "Instrumentation complete."
echo "Instrumentation complete."
;;
profile)
echo "Run workloads manually to generate .profraw files in /tmp/firecracker-profdata/"
echo "Please consult README-pgo.md for more information."
;;
;;
merge)
echo "Merging .profraw files"
if ! llvm-profdata merge -output=${PROFDATA_FILE} ${PROFDATA_DIR}/*.profraw; then
echo "Error: Failed to merge profile data."
echo " Make sure .profraw files exist and are readable."
exit 1
fi
echo "Merging complete."
echo "Merging complete."
;;
optimize)
echo "Building optimized Firecracker with profile data"