Set of ops supported must be more comprehensive #23

Closed
anssiko opened this issue Oct 11, 2022 · 1 comment

Comments

@anssiko
Member

anssiko commented Oct 11, 2022

(Related to the WebML WG Charter in development at #19)

We discussed "v2" use cases for WebNN and @wchao1115 shared the following feedback:

… for v2, one of the constant feedback from our external partners when discussing WebNN for their use case has been the ops
… the set of ops supported must be more comprehensive
… this needs to be more explicit goal, this is important
… related to that, use cases around transformers

The current charter Scope enumerates a few common ones: "convolution, pooling, softmax, normalization, fully connected, activation, recurrent neural network (RNN) and long short-term memory (LSTM)". This is not meant to be an all-inclusive list, and it gives the WG the ability to adapt to changes in this landscape.

At minimum, we should review the bullets in the Scope section, and see whether to explicitly mention some of the more recent work such as transformers. We want to give enough detail to give good direction without constraining the WG too much. The list of ops mentioned in the charter would be open-ended.

@anssiko
Member Author

anssiko commented Dec 8, 2022

The WG's work is motivated by the compelling user experiences it enables for web users. To that end, I'm proposing to add the following informative text to the Motivation and Background section:

Computer Vision enables computers to gain understanding from images or videos, Natural Language Processing enables interaction between computers and human languages, and Speech Recognition enables computers to recognize and translate spoken language into text. Bringing these experiences to the web in a privacy-preserving manner requires efficient machine learning inference capabilities built into the browser.

I think in addition we may want to clarify in the Scope section that the enumerated ops are examples of more established ops, and that the WG wants to prioritize ops that accelerate the above-mentioned user experiences in CV, NLP, and Speech Recognition. Thus I'm proposing to add this non-binding text to the Scope section, following the bullet list:

This Working Group puts priority on building blocks required by well-known model architectures in the fields of Computer Vision, Natural Language Processing and Speech Recognition.

anssiko added a commit that referenced this issue Jan 9, 2023
Move examples of well-known model architectures from the bullet
list into the text section that talks about priority use cases.
Add transformers as another example.

Grammar fix: s/Allow/Allows

Fix #23