-
Notifications
You must be signed in to change notification settings - Fork 235
Description
Description
I've been using heapless's HistoryBuffer (now renamed to HistoryBuf) as a sliding search window with much success. More precisely, I have an iterator over a substantial number of bytes and need to find a particular sequence, so I'm doing something akin to the following:
let mut search_window: HistoryBuf<u8, 11> = HistoryBuf::new();
for byte in lots_of_bytes {
search_window.write(byte);
if search_window.oldest_ordered().eq(b"some-needle") {
// Needle found; do something useful
return;
}
}While I'm not using heapless in its intended embedded use case, I found its HistoryBuffer to be much faster than using a VecDeque with a manual pop_front()/push_back() or similar alternatives.
However, with heapless 0.9.1, there seems to be a significant regression in the performance of the above snippet, which I'd largely attribute to the write() call. Of course, the oldest_ordered().eq() in every iteration has a proportionately larger impact on runtime than the write() on its own, but just upgrading heapless to 0.9.1 with this snippet still results in about 400 MB/s less throughput in a real-world scenario (~1.4 → ~1.0 GB/s).
Different optimization settings like LTO or codegen units do not seem to affect the result much. You can find a Divan-based benchmark as an MRE below. This may very well be a case of me holding it wrong, but I figured it would be worth reporting nonetheless.
Environment
-
CPU: Ryzen 7 PRO 5850U
-
Kernel Version: 6.16.1
-
rustc -vVrustc 1.89.0 (29483883e 2025-08-04) binary: rustc commit-hash: 29483883eed69d5fb4db01964cdf2af4d86e9cb2 commit-date: 2025-08-04 host: x86_64-unknown-linux-gnu release: 1.89.0 LLVM version: 20.1.7
MRE Benchmark
// benches/historybuf.rs
use std::hint::black_box;
use divan::counter::BytesCount;
use heapless::HistoryBuf; // Replace with `HistoryBuffer` for heapless 0.8.0
const NEEDLE: &[u8] = b"needle451";
const WINDOW_SIZE: usize = NEEDLE.len();
fn historybuf_search_window_write(data: Vec<u8>) {
let mut search_window: HistoryBuf<u8, WINDOW_SIZE> = HistoryBuf::new();
for byte in data {
search_window.write(byte);
}
}
#[divan::bench]
fn bench_historybuf_search_window_write(bencher: divan::Bencher) {
let total_bytes: usize = 8 * 1024 * 1024;
let bytes_before_needle = total_bytes - NEEDLE.len();
bencher
.counter(BytesCount::new(total_bytes))
.with_inputs(|| {
let mut data = Vec::with_capacity(total_bytes);
data.extend(vec![b'\0'; bytes_before_needle]);
data.extend(NEEDLE);
data
})
.bench_values(|data| {
historybuf_search_window_write(black_box(data));
});
}
fn main() {
divan::main();
}# Cargo.toml
[package]
name = "heapless-perf-mre"
edition = "2024"
[dependencies]
heapless = "0.9.1"
[dev-dependencies]
divan = "0.1.21"
[profile.release]
lto = "fat"
codegen-units = 1
[[bench]]
name = "historybuf"
harness = falseResults
cargo bench --benches
heapless 0.8.0 (HistoryBuffer)
Running benches/historybuf.rs (target/release/deps/historybuf-2a85391490864b67)
Timer precision: 20 ns
historybuf fastest │ slowest │ median │ mean │ samples │ iters
╰─ bench_historybuf_search_window_write 24.61 µs │ 275.3 µs │ 119.5 µs │ 124.4 µs │ 100 │ 100
340.7 GB/s │ 30.46 GB/s │ 70.15 GB/s │ 67.37 GB/s │ │
heapless 0.9.1 (HistoryBuf)
Running benches/historybuf.rs (target/release/deps/historybuf-974f73ca8d39caa1)
Timer precision: 20 ns
historybuf fastest │ slowest │ median │ mean │ samples │ iters
╰─ bench_historybuf_search_window_write 6.823 ms │ 7.078 ms │ 6.87 ms │ 6.892 ms │ 100 │ 100
1.229 GB/s │ 1.185 GB/s │ 1.22 GB/s │ 1.217 GB/s │ │