Introduce start session algorithm (#138)
* Introduce start session algorithm

* Nits

* Nit #2

* Use InvalidStateError instead of UnknownError

* Fix typo
beaufortfrancois authored Feb 18, 2025
1 parent 59f74d4 commit 6356249
Showing 1 changed file with 30 additions and 8 deletions.
index.bs
@@ -99,7 +99,7 @@ This does not preclude adding support for this as a future API enhancement, and
User consent can include, for example:
<ul>
<li>User click on a visible speech input element which has an obvious graphical representation showing that it will start speech input.</li>
<li>Accepting a permission prompt shown as the result of a call to <code>SpeechRecognition.start</code>.</li>
<li>Accepting a permission prompt shown as the result of a call to <a method for=SpeechRecognition>start()</a>.</li>
<li>Consent previously granted to always allow speech input for this web page.</li>
</ul>
</li>
@@ -142,6 +142,14 @@ This does not preclude adding support for this as a future API enhancement, and
The term "final result" indicates a SpeechRecognitionResult in which the final attribute is true.
The term "interim result" indicates a SpeechRecognitionResult in which the final attribute is false.

{{SpeechRecognition}} has the following internal slots:

<dl dfn-type=attribute dfn-for="SpeechRecognition">
: <dfn>[[started]]</dfn>
::
A boolean flag representing whether speech recognition has started. The initial value is <code>false</code>.
</dl>

<xmp class="idl">
[Exposed=Window]
interface SpeechRecognition : EventTarget {
@@ -277,15 +285,19 @@ See <a href="https://lists.w3.org/Archives/Public/public-speech-api/2012Sep/0072

<dl>
<dt><dfn method for=SpeechRecognition>start()</dfn> method</dt>
<dd>When the start method is called it represents the moment in time the web application wishes to begin recognition.
When the speech input is streaming live through the input media stream, then this start call represents the moment in time that the service must begin to listen.
Once the system is successfully listening to the recognition the user agent must raise a start event.
If the start method is called on an already started object (that is, start has previously been called, and no <a event for=SpeechRecognition>error</a> or <a event for=SpeechRecognition>end</a> event has fired on the object), the user agent must throw an "{{InvalidStateError!!exception}}" {{DOMException}} and ignore the call.</dd>
<dd>
1. Let <var>requestMicrophonePermission</var> be <code>true</code>.
1. Run the <a>start session algorithm</a> with <var>requestMicrophonePermission</var>.
</dd>

<dt><dfn method for=SpeechRecognition>start({{MediaStreamTrack}} audioTrack)</dfn> method</dt>
<dd>The overloaded start method does the same thing as the parameterless start method except it performs speech recognition on provided {{MediaStreamTrack}} instead of the input media stream.
If the {{MediaStreamTrack/kind}} attribute of the {{MediaStreamTrack}} is not "audio" or the {{MediaStreamTrack/readyState}} attribute is not "live", the user agent must throw an "{{InvalidStateError!!exception}}" {{DOMException}} and ignore the call.
Unlike the parameterless start method, the user agent does not check whether [=this=]'s [=relevant global object=]'s [=associated Document=] is [=allowed to use=] the [=policy-controlled feature=] named "<code>microphone</code>".</dd>
<dd>
1. Let <var>audioTrack</var> be the first argument.
1. If <var>audioTrack</var>'s {{MediaStreamTrack/kind}} attribute is NOT <code>"audio"</code>, throw an {{InvalidStateError}} and abort these steps.
1. If <var>audioTrack</var>'s {{MediaStreamTrack/readyState}} attribute is NOT <code>"live"</code>, throw an {{InvalidStateError}} and abort these steps.
1. Let <var>requestMicrophonePermission</var> be <code>false</code>.
1. Run the <a>start session algorithm</a> with <var>requestMicrophonePermission</var>.
</dd>

<dt><dfn method for=SpeechRecognition>stop()</dfn> method</dt>
<dd>The stop method represents an instruction to the recognition service to stop listening to more audio, and to try and return a result using just the audio that it has already received for this recognition.
@@ -309,6 +321,16 @@ See <a href="https://lists.w3.org/Archives/Public/public-speech-api/2012Sep/0072

</dl>
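As a non-normative usage sketch of the two <a method for=SpeechRecognition>start()</a> variants defined above: the parameterless form lets the user agent handle microphone permission, while the overloaded form recognizes an already-captured {{MediaStreamTrack}}. The error handling and the <code>getUserMedia()</code> call are illustrative only.

```ts
// Sketch: calling start() and start(audioTrack). Illustrative only.
const recognition = new SpeechRecognition();
recognition.onstart = () => console.log('recognition session started');

// Parameterless start(): the user agent requests "microphone" permission itself.
recognition.start();

try {
  // A second start() before an `end` or `error` event fires must throw.
  recognition.start();
} catch (e) {
  console.assert((e as DOMException).name === 'InvalidStateError');
}

// Overloaded start(audioTrack): the track must have kind "audio"
// and readyState "live", e.g. one obtained from getUserMedia().
async function recognizeFromTrack(): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const [track] = stream.getAudioTracks();
  const trackRecognition = new SpeechRecognition();
  trackRecognition.start(track);
}
```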

<p>When the <dfn>start session algorithm</dfn> is invoked with <var>requestMicrophonePermission</var>, the user agent MUST run the following steps:

1. If the [=current settings object=]'s [=relevant global object=]'s [=associated Document=] is NOT [=fully active=], throw an {{InvalidStateError}} and abort these steps.
1. If {{[[started]]}} is <code>true</code> and no <a event for=SpeechRecognition>error</a> or <a event for=SpeechRecognition>end</a> event has fired, throw an {{InvalidStateError}} and abort these steps.
1. Set {{[[started]]}} to <code>true</code>.
1. If <var>requestMicrophonePermission</var> is <code>true</code> and the result of [=request permission to use=] "<code>microphone</code>" is [=permission/"denied"=], abort these steps.
1. Once the system is successfully listening for recognition, [=fire an event=] named <a event for=SpeechRecognition>start</a> at [=this=].

</p>
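
For illustration only, the sketch below models how a user agent might implement the algorithm above, including the {{[[started]]}} internal slot. The helper functions (<code>isDocumentFullyActive()</code>, <code>requestPermissionToUse()</code>, <code>beginAudioCapture()</code>) are hypothetical placeholders, not part of this specification.

```ts
// Non-normative sketch of the start session algorithm.
// The declared helpers stand in for user-agent internals and are hypothetical.
declare function isDocumentFullyActive(): boolean;
declare function requestPermissionToUse(name: 'microphone'): Promise<'granted' | 'denied'>;
declare function beginAudioCapture(): Promise<void>;

class SpeechRecognitionSession {
  private started = false;          // models the [[started]] internal slot
  private endedOrErrored = false;   // set when an `error` or `end` event fires

  async startSession(requestMicrophonePermission: boolean): Promise<void> {
    // 1. The associated Document must be fully active.
    if (!isDocumentFullyActive()) {
      throw new DOMException('Document not fully active', 'InvalidStateError');
    }
    // 2. Reject a re-entrant start until an `error` or `end` event has fired.
    if (this.started && !this.endedOrErrored) {
      throw new DOMException('Recognition already started', 'InvalidStateError');
    }
    // 3. Mark the session as started.
    this.started = true;
    // 4. Optionally request the "microphone" permission; abort if denied.
    if (requestMicrophonePermission &&
        (await requestPermissionToUse('microphone')) === 'denied') {
      return;
    }
    // 5. Once the system is listening, fire the `start` event.
    await beginAudioCapture();
    this.dispatchStartEvent();
  }

  private dispatchStartEvent(): void {
    // Would dispatch a `start` event at the SpeechRecognition object.
  }
}
```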

<h4 id="speechreco-events">SpeechRecognition Events</h4>

<p>The DOM Level 2 Event Model is used for speech recognition events.