Speed up shader,execution,expression tests with async pipelines #3125
Some of these tests are among the longest running that we have, so hopefully this has a dramatic impact on runtimes. Profiling shows that almost all the time is being spent waiting for pipeline creation. By using createComputePipelineAsync, however, we ensure that multiple shaders in a batch can be compiling at once, which drops test run times significantly.
For example, the
webgpu:shader,execution,expression,binary,af_matrix_subtraction:matrix:inputSource="const";cols=4;rows=4
test, one of the longest running in the entire CTS, was taking 5 minutes to run on my Windows machine. With this patch it completes in 80 seconds on the same device (I originally reported 46 seconds). In general I'm seeing speedups of about 4x on the tests that I've tried (sorry, not the 5x-6x I first claimed; see comments below).

To demonstrate why, we can look at this chunk of a trace from the shorter
webgpu:shader,execution,expression,binary,af_matrix_subtraction:matrix:inputSource="const";cols=2;rows=2
test, which was previously taking 8 seconds to run.

The blue chunks are all pipeline compiles, and you can see they dominate the runtime and are being performed serially. With this patch it completes in about 2 seconds and the tracing produces results like this:

Same amount of blue, but now more parallel!
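
The batching idea described above can be sketched roughly as follows. This is not the code from this patch: `DeviceLike`, `ShaderModuleLike`, `ComputePipelineLike`, and `buildPipelineBatch` are hypothetical names introduced only for illustration, with the interfaces standing in for the small part of `GPUDevice` that the sketch touches. The point is that every `createComputePipelineAsync` call is issued before any of them is awaited, so the implementation is free to overlap the compiles.

```typescript
// Hypothetical minimal shapes standing in for the WebGPU objects used here.
// They are assumptions made so the sketch is self-contained.
interface ShaderModuleLike {
  readonly label?: string;
}
interface ComputePipelineLike {
  readonly label?: string;
}
interface DeviceLike {
  createShaderModule(desc: { code: string }): ShaderModuleLike;
  createComputePipelineAsync(desc: {
    layout: 'auto';
    compute: { module: ShaderModuleLike; entryPoint: string };
  }): Promise<ComputePipelineLike>;
}

// Kick off one async pipeline creation per shader, then wait for all of them.
// A serial version would instead create (and block on) one pipeline at a time.
function buildPipelineBatch(
  device: DeviceLike,
  shaderSources: string[],
  entryPoint: string = 'main'
): Promise<ComputePipelineLike[]> {
  // map() runs synchronously, so every compile has been requested
  // before the first promise has a chance to settle.
  const pending = shaderSources.map(code =>
    device.createComputePipelineAsync({
      layout: 'auto',
      compute: { module: device.createShaderModule({ code }), entryPoint },
    })
  );
  // Resolves once every pipeline is ready; results keep the input order.
  return Promise.all(pending);
}
```

Whether the compiles actually run in parallel is up to the browser and driver; the sketch only removes the serialization on the caller's side.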
Requirements for PR author:
.unimplemented()
/** documented */ and new helper files are found in helper_index.txt.
Requirements for reviewer sign-off:
When landing this PR, be sure to make any necessary issue status updates.