[investigation] buffer sharing between GPU and ML accelerator #33
Some updates:
Rebased the WebNN POC to 80.0.3960.0 for WebGPU D3D12 support. There is an issue where TF.js WebGPU crashes due to the lack of read-only storage buffer support. Worked around it by removing the …
Implemented WebGPU-WebNN interop on Windows with the same API as the macOS prototype. The WebGPU D3D12 backend and the WebNN DirectML backend share buffers via ID3D12Resource. The test results (with the above workaround for the TF.js WebGPU backend) are: …
The test platform configuration is: …
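On the web API side, the sharing looks roughly like the sketch below. This is only a sketch: setOutputGPUBuffer is a hypothetical interop method mirroring the macOS prototype's surface, outputByteLength is an assumed size, and execution comes from a compiled WebNN graph as in the snippet that follows.
// Sketch only: setOutputGPUBuffer is hypothetical; outputByteLength is assumed.
const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();
// On Windows, this GPUBuffer is backed by a D3D12 resource.
const sharedBuffer = device.createBuffer({
  size: outputByteLength,
  usage: GPUBufferUsage.STORAGE
});
// WebNN writes its result directly into the shared D3D12 resource,
// avoiding a CPU readback/upload round trip between the two backends.
execution.setOutputGPUBuffer(0, sharedBuffer); // hypothetical
await execution.startCompute();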
Leveraged the DXCore API to enumerate adapters that support compute-only devices, e.g. the ML accelerator. When the web app compiles a WebNN graph:
// Creates a WebNN graph that contains conv2d
const graph = await createWebNNConv(filterValue, noBias, noRelu);
const compilation = await graph.createCompilation();
// Compiles WebNN graph for VPU
compilation.setPreference(nn.LOW_POWER);
await compilation.finish();
const execution = await compilation.createExecution();
// input and output are TypedArray
execution.setInput(0, input);
execution.setOutput(0, output);
// Executes WebNN graph on VPU
await execution.startCompute();
If the compilation preference is …
Per the discussion in the Dec 5 CG call, the next step of the investigation is to run …
For buffer sharing across the GPU and VPU:
As mentioned by @RafaelCintron in the meeting, this usage is not recommended as it could be very slow. If the ML accelerator cannot do custom ops, web apps could still use …
A solution to the buffer-sharing problem is proposed in #482. Can we close this issue?
@huningxin PTAL at #688 and consider merging/closing this issue - we can track the latest interop proposal there.
For WebNN interoperability for custom op support, we have so far done the investigation and reported out on WebNN-WASM interop and WebNN-WebGPU interop.
According to the WebNN interop investigation next-steps discussion in the WebML CG call on 3 Oct, participants were interested in buffer sharing between the GPU and an ML accelerator. Opening this issue to capture the requirement as well as to share status and data.
The idea is that WebNN allows running expensive ops (e.g. conv2d) on the ML accelerator and sharing the buffer with a WebGPU compute shader that runs custom ops (e.g. add/relu). It can be illustrated by the following code sample.
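A minimal sketch of that flow, assuming the POC-era WebNN API used elsewhere in this thread; setOutputGPUBuffer is a hypothetical interop method, and createWebNNConv, addReluPipeline, device, convOutputBuffer, and outputLength are assumed helpers and values.
// Sketch of the intended flow. setOutputGPUBuffer is hypothetical;
// createWebNNConv and addReluPipeline are assumed helpers.
const graph = await createWebNNConv(filterValue, noBias, noRelu);
const compilation = await graph.createCompilation();
compilation.setPreference(nn.LOW_POWER); // target the ML accelerator
await compilation.finish();
const execution = await compilation.createExecution();
// conv2d runs on the ML accelerator and writes into a shared GPU buffer.
execution.setInput(0, input); // input is a TypedArray
execution.setOutputGPUBuffer(0, convOutputBuffer); // hypothetical
await execution.startCompute();
// The custom op (add/relu) runs as a WebGPU compute shader that reads
// the same buffer, so no CPU readback sits between the two devices.
const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(addReluPipeline);
pass.setBindGroup(0, device.createBindGroup({
  layout: addReluPipeline.getBindGroupLayout(0),
  entries: [{ binding: 0, resource: { buffer: convOutputBuffer } }]
}));
pass.dispatchWorkgroups(Math.ceil(outputLength / 64)); // workgroup size of 64
pass.end();
device.queue.submit([encoder.finish()]);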
Per a recommendation from @walrusmcd (thanks!), the investigation will initially target the AI on the PC Devkit. This device has both a GPU and a VPU (as an example of an ML accelerator) that are supported by the D3D12 and DirectML APIs. The Chromium WebNN POC will be enhanced to support the above scenario.
There are some dependencies that need to be worked on: …
Currently, we have done the rebase and gotten basic VPU support working in the WebNN/DML backend. We'll update here once we make progress on the WebGPU-WebNN interop on D3D12/DML.
All, please kindly let me know whether I missed anything.