Add small compute examples illustrating new WGSL primitives for AI #350
Comments
CC @dneto0
For dp4a:
dp4a accelerates matrix multiplication, right? Even a basic matrix multiplication would be enough for a sample, and it could simply display some text. A Sobel filter or any other simple convolution would make it more compelling, though.
It should also have a toggle to enable or disable dp4a, so that any performance improvement is hopefully visible.
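To make the semantics concrete, here is a minimal CPU-side sketch in TypeScript of what the WGSL built-ins `pack4xI8` and `dot4I8Packed` compute, and how an int8 matrix multiplication inner product could be built on them. The TypeScript helpers only mirror the WGSL built-in names for readability; `laneI8` and `dotInt8` are hypothetical names, and the sketch assumes vector lengths that are multiples of 4.

```typescript
// CPU reference for the WGSL built-ins pack4xI8 and dot4I8Packed.
// These are illustrative re-implementations, not part of any WebGPU API.

/** Packs four integers into one 32-bit word, keeping the low 8 bits of each. Lane 0 is the low byte. */
function pack4xI8(v: number[]): number {
  return (
    ((v[0] & 0xff) |
      ((v[1] & 0xff) << 8) |
      ((v[2] & 0xff) << 16) |
      ((v[3] & 0xff) << 24)) >>> 0
  );
}

/** Reads lane i (0..3) of a packed word as a signed 8-bit integer. */
function laneI8(word: number, i: number): number {
  // Shift the byte to the top, then arithmetic-shift back to sign-extend it.
  return (((word >>> (8 * i)) & 0xff) << 24) >> 24;
}

/** Signed dot product of the four int8 lanes of a and b, accumulated as a 32-bit integer. */
function dot4I8Packed(a: number, b: number): number {
  let sum = 0;
  for (let i = 0; i < 4; i++) {
    sum += laneI8(a, i) * laneI8(b, i);
  }
  return sum;
}

/** One output element of an int8 matrix product: a row dotted with a column, four lanes per step. */
function dotInt8(row: number[], col: number[]): number {
  let acc = 0;
  for (let k = 0; k < row.length; k += 4) {
    acc += dot4I8Packed(pack4xI8(row.slice(k, k + 4)), pack4xI8(col.slice(k, k + 4)));
  }
  return acc;
}
```

In a shader the operands would normally be stored pre-packed, so each `dot4I8Packed` call replaces four multiply-adds and reads a quarter as many words.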
Is dp4a available in the current version of WebGPU?
dp4a is available starting in Chromium M123. So today, that would be Chrome Beta and newer.
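A sample with a toggle would need to detect the feature at runtime and fall back when it is missing. The sketch below assumes detection through `wgslLanguageFeatures` with the WGSL language feature name `packed_4x8_integer_dot_product`; `GpuLike`, `supportsDp4a`, and `buildShader` are hypothetical names, and the embedded WGSL is an untested illustration rather than code from this thread.

```typescript
// Minimal structural type so the sketch compiles without WebGPU type definitions.
interface GpuLike {
  wgslLanguageFeatures?: { has(name: string): boolean };
}

const DP4A_FEATURE = 'packed_4x8_integer_dot_product';

/** True when the implementation reports the DP4a WGSL language feature. */
function supportsDp4a(gpu: GpuLike | undefined): boolean {
  return gpu?.wgslLanguageFeatures?.has(DP4A_FEATURE) ?? false;
}

/** Returns WGSL source for a dot4() helper, using the built-in or a manual fallback. */
function buildShader(useDp4a: boolean): string {
  if (useDp4a) {
    return `
requires ${DP4A_FEATURE};

fn dot4(a: u32, b: u32) -> i32 {
  return dot4I8Packed(a, b);
}`;
  }
  // Fallback: sign-extend each byte by hand, so no packed built-ins are needed.
  return `
fn dot4(a: u32, b: u32) -> i32 {
  var sum = 0;
  for (var i = 0u; i < 4u; i++) {
    let x = bitcast<i32>(a << (24u - 8u * i)) >> 24u;
    let y = bitcast<i32>(b << (24u - 8u * i)) >> 24u;
    sum += x * y;
  }
  return sum;
}`;
}
```

In a page this would be called as `buildShader(toggleOn && supportsDp4a(navigator.gpu))`, so the same pipeline setup can time both variants.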
If possible, I'd like to try my hand at this issue, at least for the next week (sorry about the timeline, day job is gonna day job). A Sobel filter is a good place to start, I think. EDIT: should this go under the 'GPGPU Compute Category' or the 'Features Category'?
Just want to make sure I understand the assignment and the intended use of dp4a here. Instead of writing, say, this for our Sobel filter (below is in pseudo-WGSL):
We should do something like this?
I suspect a Sobel filter is simple enough that it's limited by memory bandwidth instead of computation.
Somebody else should take this on. I understand the functionality, but I'm struggling with quantizing the dp4a result back into something usable.
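One possible way around the quantization problem for a Sobel filter, sketched on the CPU in TypeScript. This is not the commenter's code, and it only guesses at the obstacle: 8-bit pixels (0..255) do not fit the signed lanes that `dot4I8Packed` expects. Because the Sobel weights sum to zero, subtracting 128 from every pixel leaves the gradient unchanged, so the shifted pixels fit in int8 and the integer result needs no rescaling beyond a final normalization. All helper names are hypothetical.

```typescript
// CPU stand-ins for the WGSL built-ins, repeated here so the block is self-contained.
function pack4xI8(v: number[]): number {
  return (
    ((v[0] & 0xff) | ((v[1] & 0xff) << 8) | ((v[2] & 0xff) << 16) | ((v[3] & 0xff) << 24)) >>> 0
  );
}

function dot4I8Packed(a: number, b: number): number {
  let sum = 0;
  for (let i = 0; i < 4; i++) {
    const x = (((a >>> (8 * i)) & 0xff) << 24) >> 24; // sign-extend lane i of a
    const y = (((b >>> (8 * i)) & 0xff) << 24) >> 24; // sign-extend lane i of b
    sum += x * y;
  }
  return sum;
}

// Horizontal Sobel kernel, one packed word per row; the fourth lane is zero padding.
const GX_ROWS = [
  [-1, 0, 1, 0],
  [-2, 0, 2, 0],
  [-1, 0, 1, 0],
].map(pack4xI8);

/** Horizontal gradient of a 3x3 patch of 8-bit pixels (values 0..255). */
function sobelGx(patch: number[][]): number {
  let g = 0;
  for (let r = 0; r < 3; r++) {
    // Shift 0..255 into -128..127; the zero-sum weights cancel the offset.
    const row = pack4xI8([patch[r][0] - 128, patch[r][1] - 128, patch[r][2] - 128, 0]);
    g += dot4I8Packed(GX_ROWS[r], row);
  }
  return g; // exact integer gradient, in [-1020, 1020]
}

/** Maps a gradient to [0, 1] for display; 1020 = 255 * (1 + 2 + 1). */
function gradientToUnit(g: number): number {
  return Math.abs(g) / 1020;
}
```

This uses three `dot4I8Packed` calls per gradient, with a quarter of each call spent on padding, which is consistent with the suspicion above that a Sobel filter may end up bound by memory bandwidth rather than arithmetic.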
@beaufortfrancois requested that some small examples be published here showing how to use the new WGSL primitives aimed at AI/ML workloads: shader-f16, DP4A, and soon, subgroups. Could we consider this? Not sure what would be the most compelling; perhaps something with some visual output that runs a microbenchmark against the fallback WGSL code, assuming the feature is actually supported?