Currently, all unit tests of a given compilation are compiled into the same binary, and the test runner iterates over all of them and runs them one by one. This saves time and disk space, eliminating the overhead of building an independent executable for each test.
The same is true for fuzz tests. When rebuilding a compilation in fuzz mode, all fuzz tests (and all other non-fuzzing unit tests) are compiled into the same executable. While this saves time and disk space at first, these benefits come at the cost of a larger points-of-interest array. This array is the set of virtual memory addresses that the fuzzer is trying to reach. Keeping this array smaller makes each individual fuzz run slightly faster. Multiplied over many fuzz runs, the tradeoff tilts in favor of a separate executable for each fuzz test.
To further optimize fuzz run performance, we also want to minimize the overhead of calling into the function under test. Ideally, the hot fuzz loop will be a direct call into the code being tested, rather than a call through a dereferenced function pointer. This is constrained by how the std.testing API declares something as a fuzz test.
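To make the distinction concrete, here is a minimal sketch of the two call styles. The names `countZeroBytes`, `callThroughPointer`, and `callDirect` are hypothetical and exist only for illustration; they are not part of std.testing.

```zig
const std = @import("std");

// Hypothetical function under test; not part of the proposal.
fn countZeroBytes(input: []const u8) usize {
    var n: usize = 0;
    for (input) |byte| {
        if (byte == 0) n += 1;
    }
    return n;
}

// The callee is only known at runtime: each call goes through the pointer
// unless the optimizer can prove which function it points to.
fn callThroughPointer(f: *const fn ([]const u8) usize, input: []const u8) usize {
    return f(input);
}

// The callee is comptime-known: the compiler sees a direct call and is free
// to inline it.
fn callDirect(comptime f: fn ([]const u8) usize, input: []const u8) usize {
    return f(input);
}

test "both call styles agree" {
    const input = "a\x00b\x00";
    try std.testing.expectEqual(@as(usize, 2), callThroughPointer(&countZeroBytes, input));
    try std.testing.expectEqual(@as(usize, 2), callDirect(countZeroBytes, input));
}
```

Both functions compute the same result; the difference is only in what the compiler knows about the callee at the call site.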
This issue proposes changing the current API:
pub inline fn fuzz(context: anytype, comptime testOne: fn (context: @TypeOf(context), input: []const u8) anyerror!void, options: FuzzInputOptions) anyerror!void
Which is used like this:
test "example fuzz test" {
    try std.testing.fuzz({}, testOne, .{});
}
Into this signature:
pub fn fuzz(comptime testOne: *const fn (input: []const u8) anyerror!void, input: []const u8) anyerror!void
Note that the context is gone. Usage is now like this:
test "example unit test" {
    try std.testing.fuzz(myFuzzTest, "example input");
}
Importantly, "example unit test" is no longer itself a fuzz test. Instead, it has the side effect of informing the build system about the existence of a fuzz test, as well as one input for the corpus. So, the following unit tests may also exist:
test "another unit test" {
    try std.testing.fuzz(myFuzzTest, "same fuzz test, different input");
    try std.testing.fuzz(myFuzzTest, "yet a third input to the same fuzz test");
}
test "third unit test" {
    try std.testing.fuzz(myFuzzTest, "this fourth input still does not declare a unique fuzz test");
    //try std.testing.fuzz(foobar, "ok now this one creates a new fuzz test"); // not allowed for reasons explained below
}
So this fundamentally changes the algorithm the build system uses to discover fuzz tests: they no longer correspond one-to-one with unit tests. This handles the common case in which unit test coverage acts as a good initial corpus for fuzzing. For instance, consider this one-liner added to parser_test.zig:
--- a/lib/std/zig/parser_test.zig
+++ b/lib/std/zig/parser_test.zig
@@ -6415,6 +6415,7 @@ fn testParse(source: [:0]const u8, allocator: mem.Allocator, anything_changed: *
     return formatted;
 }
 fn testTransformImpl(allocator: mem.Allocator, fba: *std.heap.FixedBufferAllocator, source: [:0]const u8, expected_source: []const u8) !void {
+    try std.testing.fuzz(fuzzTestOneParse, source);
     // reset the fixed buffer allocator each run so that it can be re-used for each
     // iteration of the failing index
     fba.reset();
This means that all the unit test cases from the rest of the file will be collected into the initial corpus.
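As a rough illustration of how several unit tests can feed one fuzz function, the sketch below uses a hypothetical stand-in named `fuzzSketch` with the proposed signature. It assumes that in an ordinary (non-fuzz) test run the call simply executes the fuzz function once on the supplied input; the issue does not spell out that behavior, and the real function would also record the input for the corpus. `myFuzzTest` here is a made-up property check.

```zig
const std = @import("std");

// Hypothetical stand-in for the proposed std.testing.fuzz. Assumption: in a
// normal test run it just executes the fuzz function once on the given input.
fn fuzzSketch(
    comptime testOne: *const fn (input: []const u8) anyerror!void,
    input: []const u8,
) anyerror!void {
    return testOne(input);
}

// Made-up fuzz function: checks that the total number of set bits never
// exceeds eight per input byte.
fn myFuzzTest(input: []const u8) anyerror!void {
    var set_bits: usize = 0;
    for (input) |byte| set_bits += @popCount(byte);
    try std.testing.expect(set_bits <= input.len * 8);
}

test "example unit test" {
    try fuzzSketch(&myFuzzTest, "example input");
}

test "another unit test" {
    // Same fuzz function, two more corpus inputs.
    try fuzzSketch(&myFuzzTest, "same fuzz test, different input");
    try fuzzSketch(&myFuzzTest, "yet a third input to the same fuzz test");
}
```

Under that assumption, every call doubles as an ordinary test case, so a failing property is reported by the regular test run even when no fuzzing takes place.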
Finally, circling back to optimizing the fuzz function: since this API declares a comptime-known function pointer as a fuzz test, when a test binary is recompiled in fuzz mode, the one function that was referenced can be wrapped in the test runner like this:
export fn zig_fuzzer_one(input_ptr: [*]const u8, input_len: usize) void {
    theOneAndOnlyFuzzFunction(input_ptr[0..input_len]) catch |err| { ... };
}
And of course this will be inlined and optimized so that it has no overhead.
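For readers who want something compilable, here is one possible shape of such a wrapper. The error handling is an assumption made purely for illustration (the issue leaves that part elided), and `theOneAndOnlyFuzzFunction` is given a made-up body so the sketch is self-contained.

```zig
const std = @import("std");

// Made-up body standing in for the single fuzz function referenced through
// std.testing.fuzz.
fn theOneAndOnlyFuzzFunction(input: []const u8) anyerror!void {
    if (std.mem.startsWith(u8, input, "crash")) return error.FoundBug;
}

// Assumption: a failing input is reported by panicking with the error name.
// The issue does not specify what the handler does.
export fn zig_fuzzer_one(input_ptr: [*]const u8, input_len: usize) void {
    // Rebuild a slice from the raw pointer and length, then call directly.
    theOneAndOnlyFuzzFunction(input_ptr[0..input_len]) catch |err| {
        @panic(@errorName(err));
    };
}

test "wrapper forwards the bytes to the fuzz function" {
    const input: []const u8 = "hello";
    zig_fuzzer_one(input.ptr, input.len);
}
```

Because the callee is named directly rather than reached through a runtime pointer, the compiler is free to inline it into the exported function.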
In order to accomplish this without complicated and brittle machinery inside the compiler, std.testing.fuzz will note the first unit test corresponding to any particular fuzz function pointer. A second, different fuzz function may not be declared in the same unit test. This means that fuzz functions can be identified by unit test index. When recompiling unit tests for the purpose of exposing a single fuzz function, the unit test index can be used for conditional compilation to eliminate most dead code, and then the std.testing.fuzz function will simply @export the wrapper. This only works if unit tests are forbidden from declaring more than one fuzz test (not to be confused with more than one corpus input).
This restriction could be lifted by adding a more advanced fuzz test declaration function that takes a comptime string which makes the exported function unique. This id then becomes part of the tuple that identifies a fuzz test (unit test index, comptime string fuzz test id), and is then used for conditional compilation to avoid compiling multiple fuzz tests declared in the same unit test into one executable.
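The identifying tuple and the conditional compilation it enables could look roughly like the following. Everything here is hypothetical: `FuzzId`, `selected`, and `declareFuzz` are illustrative names, and the selected id is hard-coded where a real implementation would presumably receive it from the build system.

```zig
const std = @import("std");

// Hypothetical identifier: unit test index plus a comptime string id.
const FuzzId = struct {
    test_index: usize,
    name: []const u8,

    fn eql(a: FuzzId, b: FuzzId) bool {
        return a.test_index == b.test_index and std.mem.eql(u8, a.name, b.name);
    }
};

// Assumption: exactly one fuzz test is selected per compilation. It is
// hard-coded here purely for illustration.
const selected = FuzzId{ .test_index = 0, .name = "parse" };

fn fuzzParse(input: []const u8) anyerror!void {
    _ = input;
}

fn fuzzRender(input: []const u8) anyerror!void {
    _ = input;
    return error.ShouldNotRun;
}

// Returns true if this declaration was the selected one and therefore ran.
// The comparison is evaluated at compile time, so the call to a non-selected
// fuzz function is never analyzed or emitted for that instantiation.
fn declareFuzz(
    comptime id: FuzzId,
    comptime testOne: *const fn ([]const u8) anyerror!void,
    input: []const u8,
) anyerror!bool {
    if (comptime id.eql(selected)) {
        try testOne(input);
        return true;
    } else {
        return false;
    }
}

test "only the selected fuzz test is compiled in" {
    try std.testing.expect(try declareFuzz(.{ .test_index = 0, .name = "parse" }, &fuzzParse, "abc"));
    try std.testing.expect(!(try declareFuzz(.{ .test_index = 0, .name = "render" }, &fuzzRender, "abc")));
}
```

Two declarations inside the same unit test share a test index but differ in the string id, which is what lets the comptime check keep only one of them.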
By doing things this way, unit tests and fuzz tests are unified: unit tests can still check additional things such as the expected result, while also declaring the initial corpus data for property-based testing.