Could this component work for Live use cases?
Is your feature request related to a problem? Please describe. Low priority.
Could this component work with the following as input:
- a live stream of audio or video for the media
- and interim results for the STT
e.g. some providers, such as IBM, Speechmatics, and AWS, can return real-time STT results.
What would it take for this component to handle this use case? Or is it too much of a stretch, and better off left to another component that perhaps reuses some of the logic?
Describe the solution you'd like
The interim results could be appended at the end of the transcript?
Describe alternatives you've considered Treating this as outside the scope of this component, and having a separate one to handle this use case.
Additional context Low priority, just flagging it in case there's interest around this.
I wonder how to deal with interim results. What if the user is already editing? Do we replace things and move the cursor?
Yeah, fair point. I thought interim results could be appended at the end incrementally, without affecting the cursor or edits on previous results?
You need not just to append but also to replace the interim result, so you kind of need to merge based on timings. My worry is the user interfering with the content, and what this append/merge means for the undo/redo stack (the user should not be allowed to undo the transcription).
Are we assuming the interim results change over time?
Meaning, if I get a result for the 14 sec to 20 sec time range once, might I receive a more up-to-date version later? E.g. once more results are available and the STT system has had a chance to make adjustments across the rest of the text?
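To make the append-versus-merge question concrete, here is a minimal sketch of merging by timings. It assumes that an interim result for a time range can be superseded by a later result covering an overlapping range. The `SttSegment` shape and the `mergeSegment` helper are hypothetical names for illustration only; they are not part of this component or of any provider's API, and real providers use different field names.

```typescript
// Hypothetical shape for one STT result (not any provider's actual format).
interface SttSegment {
  start: number; // start time in seconds
  end: number; // end time in seconds
  text: string;
  isFinal: boolean; // false for interim results that may still be revised
}

// Merge an incoming result into the transcript based on timings:
// - finalized segments are always kept as they are,
// - interim segments whose time range overlaps the incoming one are dropped,
// - the incoming segment is added and the list is re-sorted by start time.
function mergeSegment(
  transcript: SttSegment[],
  incoming: SttSegment
): SttSegment[] {
  const overlaps = (a: SttSegment, b: SttSegment): boolean =>
    a.start < b.end && b.start < a.end;

  const kept = transcript.filter(
    (seg) => seg.isFinal || !overlaps(seg, incoming)
  );
  return [...kept, incoming].sort((a, b) => a.start - b.start);
}
```

With this approach, an interim result for 14 to 18 sec would be replaced when a later result for 14 to 20 sec arrives, while finalized segments (and any user edits made to them) stay untouched. It does not address the cursor position or the undo/redo stack, which would still need handling in the editor itself.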
https://github.com/pietrop/slate-transcript-editor/pull/1
Might have found a way to do this with slateJs 🤞