| 1 | +# Profile-Guided Optimization (PGO) for Firecracker |
| 2 | + |
| 3 | +This document provides a guide for building Firecracker using Profile-Guided |
| 4 | +Optimization (PGO) in an isolated manner. |
| 5 | + |
| 6 | +PGO can help improve performance by using runtime profiling data to guide |
| 7 | +compiler optimizations. This process is fully **optional** and does **not** |
| 8 | +affect the default Firecracker build system. |
| 9 | + |
| 10 | +## Overview |
| 11 | + |
| 12 | +PGO allows the Rust compiler (via LLVM) to use runtime profiling data to |
| 13 | +optimize the generated binary for actual workloads. This generally results in |
| 14 | +performance improvements for CPU-bound applications like Firecracker. |
| 15 | + |
| 16 | +We generate `.profraw` files at runtime and merge them into `.profdata` files, |
| 17 | +which are then reused in an optimized build of Firecracker. |
| 18 | + |
| 19 | +The PGO build process involves three main phases: |
| 20 | + |
| 21 | +1. **Instrumentation**: Build Firecracker with instrumentation to collect |
| 22 | + profiling data. |
| 23 | +1. **Profiling**: Run realistic workloads to generate `.profraw` profiling |
| 24 | + files. |
| 25 | +1. **Optimize**: Rebuild Firecracker using the collected profiling data for |
| 26 | + improved performance. |
| 27 | + |
| 28 | +## 1. Build with Instrumentation |
| 29 | + |
| 30 | +Build Firecracker with profiling instrumentation enabled. If starting in the |
| 31 | +`firecracker` directory, the command will be: |
| 32 | + |
| 33 | +``` |
| 34 | +./tools/devtool pgo_build instrument |
| 35 | +``` |
| 36 | + |
| 37 | +This produces a binary that, when executed, generates `.profraw` files |
| 38 | +containing runtime behavior data. |
| 39 | + |
| 40 | +______________________________________________________________________ |
| 41 | + |
| 42 | +**Note:** The ideal environment for PGO is the same as the ideal environment |
| 43 | +for Firecracker: x86_64 architecture, Ubuntu (currently 24.04), and bare |
| 44 | +metal (so that `/dev/kvm` is exposed). However, this step specifically can be done |
| 45 | +on non-x86_64 machines with |
| 46 | +`RUSTFLAGS="-Cprofile-generate=/tmp/firecracker-profdata" cargo build --release --package firecracker`. |
| 47 | + |
| 48 | +### Common Issue: Failed to run custom build command for `cpu-template-helper` |
| 49 | + |
| 50 | +Try: Ensure that the build directory exists and is writable: |
| 51 | + |
| 52 | +``` |
| 53 | +mkdir -p src/cpu-template-helper/build |
| 54 | +chmod -R u+rw src/cpu-template-helper/build |
| 55 | +``` |
| 56 | + |
| 57 | +Also ensure all dependencies (e.g., aws-lc-sys, userfaultfd-sys) can be built by |
| 58 | +running: |
| 59 | + |
| 60 | +``` |
| 61 | +cargo clean |
| 62 | +cargo build --release |
| 63 | +``` |
| 64 | + |
| 65 | +### Common Issue: failed to run custom build command for userfaultfd-sys v0.5.0 |
| 66 | + |
| 67 | +Try: `sudo apt install libclang-dev clang pkg-config` |
| 68 | + |
| 69 | +### Common Issue: failed to run custom build command for aws-lc-sys v0.28.1 |
| 70 | + |
| 71 | +Try: `sudo apt install cmake ninja-build perl` |
| 72 | + |
| 73 | +### Common Issue: many errors like the following |
| 74 | + |
| 75 | +``` |
| 76 | +OUTPUT: Failed to compile memcmp_invalid_stripped_check |
| 77 | +note: run with RUST_BACKTRACE=1 environment variable to display a backtrace |
| 78 | +warning: build failed, waiting for other jobs to finish... |
| 79 | +``` |
| 80 | + |
| 81 | +This might be caused by global include-path overrides in your environment. |
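One way to look for such overrides is to check the environment variables that GCC and Clang consult for extra include directories. The sketch below assumes the overrides come from `CPATH`, `C_INCLUDE_PATH`, or `CPLUS_INCLUDE_PATH`; overrides set elsewhere (for example in `CFLAGS`) would not be caught. The function name is illustrative.

```shell
# Print every include-path environment variable that is currently set.
# Assumption: the overrides come from one of these three variables.
list_include_overrides() {
    for var in CPATH C_INCLUDE_PATH CPLUS_INCLUDE_PATH; do
        # Indirect variable lookup that also works in POSIX sh.
        eval "val=\${$var:-}"
        if [ -n "$val" ]; then
            echo "$var=$val"
        fi
    done
}

list_include_overrides
```

If any variable is printed, try unsetting it (for example `unset CPATH`) in the shell you build from and rebuild.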
| 82 | + |
| 83 | +## 2. Profiling |
| 84 | + |
| 85 | +Run realistic workloads to generate these `.profraw` files. Here are some |
| 86 | +examples of typical workloads: |
| 87 | + |
| 88 | +- Boot a microVM |
| 89 | +- Simulate network activity on a microVM |
| 90 | +- Simulate basic I/O on a microVM |
| 91 | + |
| 92 | +Exercise every major subsystem you care about optimizing, so that you |
| 93 | +can benchmark the PGO build against the baseline build later. |
| 94 | + |
| 95 | +Here's an example process of booting a minimal microVM: |
| 96 | + |
| 97 | +1. Download a test kernel and rootfs. |
| 98 | +1. Start Firecracker. |
| 99 | +1. In another terminal, use `curl` to configure and start the microVM. For example: |
| 100 | + |
| 101 | +``` |
| 102 | +# Configure boot source |
| 103 | +curl --unix-socket $API_SOCKET -i \ |
| 104 | + -X PUT 'http://localhost/boot-source' \ |
| 105 | + -H 'Content-Type: application/json' \ |
| 106 | + -d '{ |
| 107 | + "kernel_image_path": "vmlinux.bin", |
| 108 | + "boot_args": "console=ttyS0 reboot=k panic=1 pci=off" |
| 109 | + }' |
| 110 | + |
| 111 | +# Configure rootfs |
| 112 | +curl --unix-socket $API_SOCKET -i \ |
| 113 | + -X PUT 'http://localhost/drives/rootfs' \ |
| 114 | + -H 'Content-Type: application/json' \ |
| 115 | + -d '{ |
| 116 | + "drive_id": "rootfs", |
| 117 | + "path_on_host": "rootfs.ext4", |
| 118 | + "is_root_device": true, |
| 119 | + "is_read_only": false |
| 120 | + }' |
| 121 | + |
| 122 | +# (Optional) set machine config if you want custom vCPU/RAM: |
| 123 | +curl --unix-socket $API_SOCKET -i \ |
| 124 | + -X PUT 'http://localhost/machine-config' \ |
| 125 | + -H 'Content-Type: application/json' \ |
| 126 | + -d '{ |
| 127 | + "vcpu_count": 1, |
| 128 | + "mem_size_mib": 128 |
| 129 | + }' |
| 130 | + |
| 131 | +# Start the VM |
| 132 | +curl --unix-socket $API_SOCKET -i \ |
| 133 | + -X PUT 'http://localhost/actions' \ |
| 134 | + -H 'Content-Type: application/json' \ |
| 135 | + -d '{"action_type":"InstanceStart"}' |
| 136 | +``` |
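The `curl` commands above expect `$API_SOCKET` to point at the API socket of a running Firecracker process. A minimal sketch of step 2, assuming the instrumented binary is at `build/cargo_target/release/firecracker` (the path this guide gives for the optimized build) and using `/tmp/fc.socket` as the socket path; the variable `FC_BIN` and the function name are illustrative:

```shell
# Assumed locations; adjust them to your setup.
API_SOCKET="/tmp/fc.socket"
FC_BIN="./build/cargo_target/release/firecracker"

start_firecracker() {
    # Remove a stale socket left by a previous run, then start Firecracker.
    rm -f "$API_SOCKET"
    "$FC_BIN" --api-sock "$API_SOCKET"
}

# Only start when the binary is actually present.
if [ -x "$FC_BIN" ]; then
    start_firecracker
fi
```

Set the same `API_SOCKET` value in the terminal where you run the `curl` commands.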
| 137 | + |
| 138 | +Please refer to the |
| 139 | +[Firecracker getting started guide](https://github.com/firecracker-microvm/firecracker/blob/main/docs/getting-started.md) |
| 140 | +for a more in-depth look at how to do this. |
| 141 | + |
| 142 | +## 3. Optimize |
| 143 | + |
| 144 | +After running your desired workloads, the resulting `.profraw` files can be seen |
| 145 | +with: |
| 146 | + |
| 147 | +``` |
| 148 | +ls /tmp/firecracker-profdata/ |
| 149 | +``` |
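Before merging, it can help to confirm that the workloads actually produced raw profiles. A small sketch, assuming the profiles land directly in `/tmp/firecracker-profdata` as shown above; the helper name is illustrative:

```shell
# Count the .profraw files directly inside a directory.
# Prints 0 if the directory does not exist.
count_profraw() {
    find "$1" -maxdepth 1 -type f -name '*.profraw' 2>/dev/null | wc -l | tr -d ' '
}

PROFILE_DIR="/tmp/firecracker-profdata"
if [ "$(count_profraw "$PROFILE_DIR")" -eq 0 ]; then
    echo "no .profraw files in $PROFILE_DIR; run a workload first" >&2
fi
```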
| 150 | + |
| 151 | +______________________________________________________________________ |
| 152 | + |
| 153 | +### Merging |
| 154 | + |
| 155 | +To merge these files into valid profile data use: |
| 156 | + |
| 157 | +``` |
| 158 | +./tools/devtool pgo_build merge |
| 159 | +``` |
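For reference, merging raw profiles by hand is done with `llvm-profdata merge`. The sketch below assumes the raw profiles are in `/tmp/firecracker-profdata` and writes to a hypothetical `merged.profdata` in the same directory; whether `devtool` uses the same output path is not confirmed here.

```shell
PROFILE_DIR="/tmp/firecracker-profdata"
MERGED="$PROFILE_DIR/merged.profdata"   # hypothetical output name

# Merge only if the tool and the input directory are available.
if command -v llvm-profdata >/dev/null 2>&1 && [ -d "$PROFILE_DIR" ]; then
    llvm-profdata merge -o "$MERGED" "$PROFILE_DIR" \
        || echo "merge failed; check the llvm-profdata version" >&2
fi
```

A merged file of this kind is what `rustc` consumes through its `-Cprofile-use=<path>` flag.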
| 160 | + |
| 161 | +#### Common Issue: version mismatch |
| 162 | + |
| 163 | +This will look something like: “raw profile format version = 10; expected |
| 164 | +version = 9” |
| 165 | + |
| 166 | +This is common and might even happen in an ideal environment, because the Rust |
| 167 | +toolchain produces v10 profiles but Ubuntu 24.04 packages do not ship an |
| 168 | +`llvm-profdata` that can read v10. You may be able to install the matching LLVM |
| 169 | +on your host, but if it gives you trouble, using Rust's nightly toolchain can |
| 170 | +also work. |
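
A quick way to see whether the two tools come from different LLVM releases is to compare the LLVM version that `rustc -vV` reports with the one that `llvm-profdata --version` reports. This is only a sketch: the helper name is illustrative, and matching major versions are a hint, not a guarantee, that the profile formats agree.

```shell
# Extract the LLVM major version from tool output read on stdin.
# Handles "LLVM version: 18.1.7" (rustc -vV style) and
# "LLVM version 18.1.3" (llvm-profdata --version style).
llvm_major() {
    sed -n 's/.*LLVM version:\{0,1\} *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

if command -v rustc >/dev/null 2>&1 && command -v llvm-profdata >/dev/null 2>&1; then
    echo "rustc LLVM major:         $(rustc -vV | llvm_major)"
    echo "llvm-profdata LLVM major: $(llvm-profdata --version | llvm_major)"
fi
```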
| 171 | + |
| 172 | +To use nightly, try: |
| 173 | + |
| 174 | +``` |
| 175 | +rustup toolchain install nightly |
| 176 | +rustup component add llvm-tools-preview --toolchain nightly |
| 177 | + |
| 178 | +export PATH="$HOME/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:$PATH" |
| 179 | +``` |
| 180 | + |
| 181 | +______________________________________________________________________ |
| 182 | + |
| 183 | +### Optimized Build |
| 184 | + |
| 185 | +Once the `.profraw` files are merged into `.profdata`, you can re-build with the |
| 186 | +merged profile: |
| 187 | + |
| 188 | +``` |
| 189 | +./tools/devtool pgo_build optimize |
| 190 | +``` |
| 191 | + |
| 192 | +Then verify that your optimized binary is at |
| 193 | +`build/cargo_target/release/firecracker` and run it with: |
| 194 | + |
| 195 | +``` |
| 196 | +./build/cargo_target/release/firecracker --api-sock /tmp/fc.socket |
| 197 | +``` |
| 198 | + |
| 199 | +## 4. Verify/Benchmark |
| 200 | + |
| 201 | +Once you have this PGO build, you can run any of the repository's existing |
| 202 | +performance tests to observe the speed-ups. |
| 203 | + |
| 204 | +### Community Benchmark Results |
| 205 | + |
| 206 | +Please feel free to fork this repo, run your own benchmarks, and submit a PR |
| 207 | +updating the table below. |
| 208 | + |
| 209 | +| Machine (CPU/RAM) | Firecracker (non-PGO) | Firecracker (PGO) | Δ (PGO vs. baseline) | Notes | |
| 210 | +| ----------------------------- | --------------------: | ----------------: | -------------------: | -------------------------------------------- | |
| 211 | +| AMD Ryzen 7 7700X; 32 GiB RAM | 0.01275 | 0.01079 | -15.37% | Ubuntu 24.04; used test_boottime.py for both | |
| 212 | +| | | | | | |
| 213 | +| | | | | | |
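
The Δ column is the relative change of the PGO result against the baseline, `(PGO - baseline) / baseline * 100`. A small helper for computing it when adding a row (the function name is illustrative):

```shell
# Percent change of the PGO result relative to the non-PGO baseline.
# Usage: pct_change <baseline> <pgo>
pct_change() {
    awk -v base="$1" -v pgo="$2" 'BEGIN { printf "%.2f\n", (pgo - base) / base * 100 }'
}

pct_change 0.01275 0.01079   # prints -15.37, matching the first row
```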