
GoogleSpeechAPIConverter now works asynchronously for long-running inputs

Open tyarkoni opened this issue 6 years ago • 4 comments

For audio clips > 1 minute, Google's Cloud Speech-to-Text API now requires users to use the asynchronous long_running_recognize method instead of the synchronous recognize method. This will require us to check the duration of the audio and make an asynchronous call if needed.
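A minimal sketch of the duration check described above, not the actual pliers implementation. It assumes WAV input (so the standard-library `wave` module can read the duration) and a client object exposing `recognize` and `long_running_recognize`, as `google.cloud.speech.SpeechClient` does. The helper names `wav_duration_seconds` and `transcribe` and the `SYNC_LIMIT_SECONDS` constant are hypothetical.

```python
import wave

# Assumed threshold, taken from the "> 1 minute" rule described in this issue.
SYNC_LIMIT_SECONDS = 60.0


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds (stdlib only)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def transcribe(client, config, audio, duration):
    """Pick the synchronous or asynchronous call based on clip duration.

    `client` is assumed to behave like google.cloud.speech.SpeechClient;
    `config` and `audio` are passed through unchanged.
    """
    if duration <= SYNC_LIMIT_SECONDS:
        # Short clip: the blocking call returns the response directly.
        return client.recognize(config=config, audio=audio)
    # Long clip: start a long-running operation and wait for it to finish.
    operation = client.long_running_recognize(config=config, audio=audio)
    return operation.result()
```

Waiting on `operation.result()` keeps the converter's interface synchronous from the caller's point of view; a real implementation would probably also want a timeout and error handling.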

tyarkoni avatar Apr 06 '19 19:04 tyarkoni

@adelavega, is this a very recent change (doesn't look like it...), or have you been working around it in NeuroScout by chopping up audio files or doing something else?

tyarkoni avatar Apr 06 '19 19:04 tyarkoni

Ugh, this is really annoying from our perspective. It looks like the Speech-to-Text API doesn't even work with an external URI for long-running requests; it requires files to be in Google Cloud Storage. That's going to add a whole layer of complexity to pliers if there's no way around it...

tyarkoni avatar Apr 06 '19 19:04 tyarkoni

Well, I haven't really been using it, for one. I've been avoiding speech-to-text, as I can usually get a transcript. And when I did use it, I used IBM Watson.

adelavega avatar Apr 06 '19 22:04 adelavega

I may open a PR soon for a really good and accurate speech-to-text API (😉) that's fairly easy to use (still asynchronous).

qmac avatar Apr 07 '19 05:04 qmac