Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create speech-recognition-context.md #140

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

yrw-google
Copy link

Add an explainer for the new speech recognition context feature

@evanbliu evanbliu requested a review from padenot February 19, 2025 23:36
Copy link
Member

@padenot padenot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very welcome addition, thanks for the initial design. Let's iterate a bit on the API shape, but I'm excited about this.



### 2. **SpeechRecognitionPhraseList**
This interface holds a list of `SpeechRecognitionPhrase` and supports adding more `SpeechRecognitionPhrase` to the list.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just sequence<SpeechRecognitionPhrase> ?

Copy link
Author

@yrw-google yrw-google Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I experimented with using sequence a bit. I think it is feasible to put sequence<SpeechRecognitionPhrase> inside SpeechRecognitionContext and get rid of SpeechRecognitionPhraseList, but that means we need to move other methods like the definitions of length (which is somehow required by blink/v8 if it detects an array) and addItem from SpeechRecognitionPhraseList into SpeechRecognitionContext too. In that case if we add more types of data to SpeechRecognitionContext in the future, it might become confusing, and we won't be able to support another array inside SpeechRecognitionContext because the definition of length needs to duplicate.

From the IDL examples I can find, seems like it is a common practice to create a new ObjectList interface to support a new Object interface (e.g. DataTransferItemList), and sequence is used more often in a dictionary. I'm not sure if we should make SpeechRecognitionPhrase and SpeechRecognitionContext become dictionary instead, so that their relationship will be as simple as SpeechRecognitionContext contains a sequence of SpeechRecognitionPhrase. We want to perform data validation on each SpeechRecognitionPhrase so using dictionary may oversimplify things? Let me know what you think!

Comment on lines 70 to 110
const recognition = new SpeechRecognition();
recognition.start();
var list = new SpeechRecognitionPhraseList();
list.addItem(new SpeechRecognitionPhrase("updated text", 2.0));
var context = new SpeechRecognitionContext(list);
recognition.updateContext(context);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do we lose by simply doing:

Suggested change
const recognition = new SpeechRecognition();
recognition.start();
var list = new SpeechRecognitionPhraseList();
list.addItem(new SpeechRecognitionPhrase("updated text", 2.0));
var context = new SpeechRecognitionContext(list);
recognition.updateContext(context);
const recognition = new SpeechRecognition();
recognition.start();
var list = [{text: "update text", weight: 2.0}];
recognition.updateContext(list);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same interface simplification goes for the other example.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you still want to simplify things like this, can you tell me how IDL will look like in this case? We also want to perform data validation on each SpeechRecognitionPhrase, so using new SpeechRecognitionPhrase() helps us throw an error as we validate there, otherwise we would need to hold on validation until updateContext() is called?

if (event.error == "recognition-context-not-supported") {
console.log("Recognition context is not supported: ", event);
}
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this example leaving the actual use of the context that would lead to an error? I'm not sure if I follow.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote and added more sample codes to hopefully make it clear. Please take a look again and let me know if additional explanation is needed!

yrw-google added a commit to yrw-google/web-speech-api that referenced this pull request Feb 24, 2025
Explainer for speech recognition context is added in WebAudio#140
yrw-google added a commit to yrw-google/web-speech-api that referenced this pull request Feb 24, 2025
Explainer for speech recognition context is added in WebAudio#140
yrw-google added a commit to yrw-google/web-speech-api that referenced this pull request Mar 4, 2025
Explainer for speech recognition context is added in WebAudio#140
yrw-google added a commit to yrw-google/web-speech-api that referenced this pull request Mar 4, 2025
Explainer for speech recognition context is added in WebAudio#140
yrw-google added a commit to yrw-google/web-speech-api that referenced this pull request Mar 4, 2025
Explainer for speech recognition context is added in WebAudio#140
yrw-google added a commit to yrw-google/web-speech-api that referenced this pull request Mar 4, 2025
Explainer for speech recognition context is added in WebAudio#140
yrw-google added a commit to yrw-google/web-speech-api that referenced this pull request Mar 4, 2025
Explainer for speech recognition context is added in WebAudio#140
Add an explainer for the new speech recognition context feature
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants