Mic doesn't turn off if applyPolyfill is set
Describe the bug
I'm trying to set up the polyfill using react-speech-recognition and this library. My mic behaves correctly when I don't use the polyfill.
To Reproduce
My code is almost the same as in the example. I have only changed the continuous property.
const appId = process.env.REACT_APP_SPEECHLY_ID;
const SpeechlySpeechRecognition = createSpeechlySpeechRecognition(appId);
SpeechRecognition.applyPolyfill(SpeechlySpeechRecognition);

const SomeComponent = () => {
  const {
    transcript,
    resetTranscript,
    listening,
    browserSupportsSpeechRecognition,
  } = useSpeechRecognition();

  useEffect(() => {
    if (!listening) {
      SpeechRecognition.stopListening();
      setTitleState(prevState);
    }
  }, [listening]);

  useEffect(() => {
    if (transcript) {
      setValue('do something here');
    }
  }, [transcript]);

  const handleOnClikOrTapStart = () => {
    setPrevState(titleState);
    if (browserSupportsSpeechRecognition) {
      resetTranscript();
      setTitleState(STATE_VOICE);
      SpeechRecognition.startListening({
        language: "en-US",
      });
    }
  };

  return (
    <Button
      className={classnames("voice-btn", {
        active: listening,
      })}
      btnType="icon"
      onMouseDown={handleOnClikOrTapStart}
      onTouchStart={handleOnClikOrTapStart}
      disabled={!browserSupportsSpeechRecognition}
    >
      <MicIcon />
    </Button>
  )
}
Expected behavior
I expect the mic to turn off, but it stays on all the time after I click the button.
Environment
- Platform: [Desktop]
- OS: [macOS (latest)]
- Browser [chrome, safari]
- Version [latest]
- Package version [latest]
Hi @newdisease thanks for raising this issue!
You've encountered an interesting difference between the browser's native speech recognition and the Speechly polyfill. When SpeechRecognition.stop is called in the native version, the browser will stop transcribing and turn off the microphone. In the polyfill, voice data will stop being streamed to the Speechly API and transcription will end (i.e. the listening state will become false) but the microphone will remain active. However, the functionality should be identical between the two as no microphone data is being processed while transcription is turned off - the only difference is that the user will see the recording icon on their browser tab. Keeping the microphone on probably has a slight efficiency advantage in the case where transcription is needed again later. That said, I could understand if this affected user trust if they asked for the microphone to be turned off but could still see the recording icon.
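For context, browsers typically drop the recording indicator once every track of the captured stream has been stopped. The sketch below shows that step in isolation. The names releaseMicrophone, TrackSource and StoppableTrack are hypothetical and are not part of the polyfill or the Speechly Browser Client; the structural types are only there so the snippet can run outside a browser (a real MediaStream satisfies them).

```typescript
// Minimal structural types (assumptions for this sketch).
// A browser MediaStream and its MediaStreamTrack objects match these shapes.
interface StoppableTrack {
  stop(): void;
}

interface TrackSource {
  getTracks(): StoppableTrack[];
}

// Hypothetical helper: stops every track so the browser can release the device.
// Returns the number of tracks that were stopped.
function releaseMicrophone(stream: TrackSource): number {
  const tracks = stream.getTracks();
  tracks.forEach((track) => track.stop());
  return tracks.length;
}
```

In a browser this could be called with the stream returned by navigator.mediaDevices.getUserMedia({ audio: true }). Whether the polyfill can do this safely depends on how the Browser Client holds on to the stream, which this sketch does not cover.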
Besides the microphone staying on with the polyfill, have you seen any unexpected behaviour differences that affect how your web app works?
Hi. Thanks for your answer!
After recording, the microphone is no longer capturing anything. However, the recording icon still appears, which confuses my users.
Is there any way to fix that?
@newdisease I totally understand, and have raised an issue on the Speechly Browser Client library for you to follow. If we're able to work out a suitable change to the Browser Client, I'll update it in the polyfill.
In the meantime, the best I can suggest is to make it as clear as possible in your UI when no voice data is being processed (i.e. when the web app is not "listening"). In most cases, users will pay more attention to the UI than the recording icon in the browser tab, so they shouldn't notice the discrepancy too often. But I do understand your concern about any confusion or trust/privacy issues that could cause.
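As a minimal sketch of that suggestion, the button label and status text can be derived directly from the listening flag returned by useSpeechRecognition. The helper micStatus, the MicStatus type and the wording are hypothetical, shown only to illustrate the idea.

```typescript
interface MicStatus {
  label: string;
  description: string;
}

// Hypothetical helper: maps the `listening` flag to user-facing text,
// so the UI states explicitly when no voice data is being processed.
function micStatus(listening: boolean): MicStatus {
  return listening
    ? { label: "Stop", description: "Listening: your voice is being transcribed." }
    : { label: "Speak", description: "Not listening: no voice data is being processed." };
}
```

The description could be rendered next to the mic button, for example as visible helper text or a tooltip.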
Hi @JamesBrill, I investigated a solution that allows closing the microphone after each utterance without @speechly/browser-client API changes. If you find that this approach works, we could work towards creating a PR out of it (currently many unit tests fail because the internal initialise function was retired).
Here's a summary of changes:
- start() always waits for the microphone to be initialized and reattached. This works well on modern browsers, but at least old iOS 12 suffers from considerable lag.
- stop() always detaches and closes the mic.
- There is a new listenPromise that orchestrates start and stop calls into a queue so they work neatly in stress testing. Also, the transcribing state change is immediate, which ensures that duplicate start/stop calls are not made.
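The queueing idea in the last point can be sketched on its own, separate from the Speechly classes. Everything here is hypothetical: AudioSource stands in for the microphone and client pair, and QueuedRecogniser, enqueue and createLoggingSource are illustrative names. One deliberate difference from the patch: this sketch keeps the queue usable after a task fails, which the patch does not attempt.

```typescript
// Hypothetical stand-in for the microphone/client pair; not the Speechly API.
interface AudioSource {
  open(): Promise<void>;
  close(): Promise<void>;
}

class QueuedRecogniser {
  private transcribing = false;
  // Tail of the task queue: each new task is chained onto it.
  private tail: Promise<void> = Promise.resolve();

  constructor(private readonly source: AudioSource) {}

  start(): Promise<void> {
    if (this.transcribing) {
      return this.tail; // duplicate start: just wait for pending work
    }
    this.transcribing = true; // flipped synchronously, before any await
    return this.enqueue(() => this.source.open());
  }

  stop(): Promise<void> {
    if (!this.transcribing) {
      return this.tail; // duplicate stop: just wait for pending work
    }
    this.transcribing = false;
    return this.enqueue(() => this.source.close());
  }

  private enqueue(task: () => Promise<void>): Promise<void> {
    const result = this.tail.then(task);
    // Keep the chain alive if this task rejects; the caller still sees the error.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

// Test double that records when each operation begins and ends.
function createLoggingSource(log: string[]): AudioSource {
  const step = async (name: string): Promise<void> => {
    log.push(`${name}:begin`);
    await Promise.resolve(); // yield, so overlapping calls would interleave
    log.push(`${name}:end`);
  };
  return { open: () => step("open"), close: () => step("close") };
}
```

Calling start(), stop(), start(), stop() back to back without awaiting runs the open and close steps strictly one after another, because each task only begins once the previous tail has settled.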
And here's the diff for git apply createSpeechRecognition.ts.patch:
diff --git a/src/createSpeechRecognition.ts b/src/createSpeechRecognition.ts
index 4828eec..2e0258d 100644
--- a/src/createSpeechRecognition.ts
+++ b/src/createSpeechRecognition.ts
@@ -32,11 +32,11 @@ export const createSpeechlySpeechRecognition = (appId: string): SpeechRecognitio
return class SpeechlySpeechRecognition implements SpeechRecognition {
static readonly hasBrowserSupport: boolean = browserSupportsAudioApis
-
private readonly client: BrowserClient
- private clientInitialised = false
+ private readonly microphone: BrowserMicrophone
private aborted = false
private transcribing = false
+ private listenPromise: Promise<void> | null = null
continuous = false
interimResults = false
@@ -46,15 +46,14 @@ export const createSpeechlySpeechRecognition = (appId: string): SpeechRecognitio
constructor() {
this.client = new BrowserClient({ appId })
+ this.microphone = new BrowserMicrophone()
this.client.onSegmentChange(this.handleResult)
}
public start = async (): Promise<void> => {
try {
this.aborted = false
- await this.initialise()
- await this.client.start()
- this.transcribing = true
+ await this._start()
} catch (e) {
if (e === ErrNoAudioConsent) {
this.onerror(MicrophoneNotAllowedError)
@@ -73,31 +72,49 @@ export const createSpeechlySpeechRecognition = (appId: string): SpeechRecognitio
await this._stop()
}
- private readonly initialise = async (): Promise<void> => {
- if (!this.clientInitialised) {
- const microphone = new BrowserMicrophone()
- await microphone.initialize()
- const { mediaStream } = microphone
+ private readonly _start = async (): Promise<void> => {
+ if (this.transcribing) {
+ return
+ }
+
+ this.transcribing = true
+
+ this.listenPromise = (async () => {
+ // Wait for earlier task(s) to complete, effectively adding to a task queue
+ await this.listenPromise
+ await this.microphone.initialize()
+ const { mediaStream } = this.microphone
if (mediaStream === null || mediaStream === undefined) {
throw ErrDeviceNotSupported
}
await this.client.attach(mediaStream)
- this.clientInitialised = true
- }
+ await this.client.start()
+ })()
+
+ await this.listenPromise
}
private readonly _stop = async (): Promise<void> => {
if (!this.transcribing) {
return
}
- await this.initialise()
- try {
- await this.client.stop()
- this.transcribing = false
- this.onend()
- } catch (e) {
- // swallow errors
- }
+
+ this.transcribing = false
+
+ this.listenPromise = (async () => {
+ // Wait for earlier task(s) to complete, effectively adding to a task queue
+ await this.listenPromise
+ try {
+ await this.client.stop()
+ await this.client.detach();
+ await this.microphone.close();
+ this.onend()
+ } catch (e) {
+ // swallow errors
+ }
+ })()
+
+ await this.listenPromise
}
private readonly handleResult = (segment: Segment): void => {
Hi! Any updates on this? I'm having the same issue with my app. Although I'm not using the mic anymore, the phone's "mic in use" indicator is still active, and users think they are still being heard. It's a big "security" problem for me.
@JesusADS Don't worry, I haven't forgotten about this issue - I've just lacked free time for open source recently. I've applied @arzga's patch in a Pull Request to get the ball rolling. 😃
I would appreciate it very much if this PR could be applied.
Thanks
The PR that fixes this is now ready for review - apologies for the delay. https://github.com/speechly/speech-recognition-polyfill/pull/31 @arzga @bigdatabaracus
@newdisease @JesusADS @kla-ko A fix for this has been released in version 1.3.0 - let us know if this resolves the issue for you.
Works on Chrome/Desktop, Opera/Desktop, Safari/Mobile, Samsung/Mobile
Thanks