termux-speech-to-text buffers progressive output

Open yesco opened this issue 4 years ago • 1 comment

Problem description termux-speech-to-text -p

does not print progressive output as it arrives; the output seems to be buffered and comes out in larger chunks.

Steps to reproduce Run the command

termux-speech-to-text -p

and speak at length. No output appears until either the end of the recording or once enough has buffered up.

Expected behavior I'd expect it to output each line as each new word is recognized. That's what progressive means.

Additional information I've tried to see if it's the terminal or script that buffers, so I wrapped the execution in:

stdbuf -i0 -o0 /data/data/com.termux/files/usr/libexec/termux-api SpeechToText

But there is no difference.

Aug 26 '21 12:08 yesco

adb logcat -c && adb logcat | grep -i 'SpeechToText'

Output:

06-06 17:08:29.502  9754  9754 E TermuxAPI.SpeechToTextAPI: onReceive
06-06 17:08:29.510  9754  9754 E TermuxAPI.SpeechToTextService: onCreate
06-06 17:08:29.515  9754  9754 E TermuxAPI.SpeechToTextService: Start listening
06-06 17:08:29.519  9754  9754 E TermuxAPI.SpeechToTextService: After start listening
06-06 17:08:29.522  9754  5574 E TermuxAPI.SpeechToTextService: onHandleIntent:
06-06 17:08:29.522  9754  5574 E TermuxAPI.SpeechToTextService: Intent { cmp=com.termux.api/.apis.SpeechToTextAPI$SpeechToTextService (has extras) }
06-06 17:08:29.522  9754  5574 E TermuxAPI.SpeechToTextService: Bundle[
06-06 17:08:29.522  9754  5574 E TermuxAPI.SpeechToTextService: socket_input: `6d61ee39ebf16df9-15ba-43ee-9ee7-f671be5cd220823bf0828827`
06-06 17:08:29.522  9754  5574 E TermuxAPI.SpeechToTextService: socket_output: `7b56f9c480f4c2b3-15ba-4407-ac8c-6137c013f826b9e561eab12c`
06-06 17:08:29.522  9754  5574 E TermuxAPI.SpeechToTextService: api_method: `SpeechToText`
06-06 17:08:29.522  9754  5574 E TermuxAPI.SpeechToTextService: ]
06-06 17:08:30.554  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([])
06-06 17:08:30.995  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([])
06-06 17:08:31.129  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour])
06-06 17:08:31.564  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je])
06-06 17:08:31.775  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle])
06-06 17:08:31.892  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle])
06-06 17:08:32.060  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle])
06-06 17:08:32.102  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle])
06-06 17:08:32.270  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle])
06-06 17:08:32.423  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle])
06-06 17:08:32.599  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle])
06-06 17:08:32.676  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin])
06-06 17:08:32.797  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin])
06-06 17:08:33.001  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison])
06-06 17:08:33.118  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison])
06-06 17:08:33.163  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et])
06-06 17:08:33.381  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et je])
06-06 17:08:33.498  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et je])
06-06 17:08:33.762  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et je parle])
06-06 17:08:33.817  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et je parle])
06-06 17:08:33.954  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et je parle])
06-06 17:08:34.072  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et je parle pour])
06-06 17:08:34.516  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et je parle pour combler])
06-06 17:08:34.698  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et je parle pour combler])
06-06 17:08:35.182  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et je parle pour combler voilà])
06-06 17:08:35.434  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour je m'appelle Benjamin Loison et je parle pour combler voilà])
06-06 17:08:35.969  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onEndOfSpeech()
06-06 17:08:35.973  9754  9754 E TermuxAPI.SpeechToTextService: onDestroy
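To line these entries up against timestamps taken on the client side, the log can be reduced to one "time text" pair per partial result. This is only a minimal sketch: partial_results is a hypothetical helper name, and the pattern assumes the exact line layout shown in the log above.

```shell
# Hypothetical helper: reduce logcat lines to "<time> <partial text>".
# Assumes the line layout shown in the log above.
partial_results() {
  sed -n 's/^[0-9-]* \([0-9:.]*\) .*onPartialResults(\[\(.*\)])$/\1 \2/p'
}

# Example with two lines copied from the log above;
# only the onPartialResults line is printed.
partial_results <<'EOF'
06-06 17:08:31.129  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onPartialResults([bonjour])
06-06 17:08:35.969  9754  9754 E TermuxAPI.SpeechToTextService: RecognitionListener#onEndOfSpeech()
EOF
```

Lines that are not partial results, such as onEndOfSpeech, are dropped by the -n / p combination.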

Bash script:

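# Prefix each output line with the time at which its first character arrives.
termux-speech-to-text -p | {
  at_line_start=1
  while IFS= read -r -N 1 char; do
    [ -n "$at_line_start" ] && printf '%s: ' "$(date)"
    printf '%s' "$char"
    at_line_start=
    [ "$char" = $'\n' ] && at_line_start=1
  done
}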

Output:

Fri Jun  6 17:08:35 CEST 2025: 
Fri Jun  6 17:08:36 CEST 2025: 
Fri Jun  6 17:08:36 CEST 2025: bonjour
Fri Jun  6 17:08:36 CEST 2025: bonjour je
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et je
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et je
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et je parle
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et je parle
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et je parle
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et je parle pour
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et je parle pour combler
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et je parle pour combler
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et je parle pour combler voilà
Fri Jun  6 17:08:36 CEST 2025: bonjour je m'appelle Benjamin Loison et je parle pour combler voilà
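In the logcat output the partial results are spread over 17:08:30 to 17:08:35, while here every line is stamped at 17:08:35 or 17:08:36, after onEndOfSpeech. To rule out the timestamping pipeline itself as the source of the delay, a control run with a slow stand-in producer may help. This is a sketch only: slow_producer and stamp_lines are hypothetical names, and the producer merely imitates the growing lines of progressive output.

```shell
# Hypothetical stand-in for termux-speech-to-text -p:
# prints one growing line per interval (default 1 second).
slow_producer() {
  delay=${1:-1}
  text=
  for word in bonjour je parle; do
    text="${text:+$text }$word"
    printf '%s\n' "$text"
    sleep "$delay"
  done
}

# Prefix every line with the time at which it was read from the pipe.
stamp_lines() {
  while IFS= read -r line; do
    printf '%s: %s\n' "$(date +%T)" "$line"
  done
}

slow_producer 1 | stamp_lines
```

If the stamps in the control run are about one second apart, the reading side is not what delays the output, which points at whatever sits upstream of the pipe.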

Source: Benjamin_Loison/linux/issues/82

termux-api/blob/7e225c97f58018d3f78d6fae17470782aadd8c17/app/src/main/java/com/termux/api/apis/SpeechToTextAPI.java#L26

termux-api/blob/7e225c97f58018d3f78d6fae17470782aadd8c17/app/src/main/java/com/termux/api/apis/SpeechToTextAPI.java#L76

termux-api/blob/7e225c97f58018d3f78d6fae17470782aadd8c17/app/src/main/java/com/termux/api/apis/SpeechToTextAPI.java#L165 seems involved.

Jun 06 '25 15:06 Benjamin-Loison