speech_recognition
Recognizer.recognize_vosk is slow for bigger models
Steps to reproduce
- On smaller models, the Recognizer.recognize_vosk method is pretty fast, but with a bigger model it slows down significantly.
- I copied the current Recognizer.recognize_vosk code and modified it slightly to store the KaldiRecognizer on the Recognizer, similar to how the vosk Model instance is already stored.
- The modified code is significantly faster.
Expected behaviour
The same speed should be observed with both versions.
Actual behaviour
A is the modified function that caches the KaldiRecognizer; B is the stock Recognizer.recognize_vosk. Each pair of lines is one utterance.
A Time taken: 5.882570743560791 # <-- includes the initial Model load, so it does not count.
B Time taken: 0.6530005931854248
A Time taken: 0.18399715423583984
B Time taken: 0.5320005416870117
A Time taken: 0.5459988117218018
B Time taken: 1.0760009288787842
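To make the comparison above easier to read, the per-utterance ratio can be worked out with a few lines of Python. This is only arithmetic on the three rounds logged above (values rounded to four decimals), with A being the modified function and B the stock method. It skips the first round because A's first call also pays for loading the model. Three rounds are far too few for a real benchmark.

```python
# Timings copied (rounded) from the log above, in seconds.
a_times = [5.8826, 0.1840, 0.5460]  # A: modified function, cached KaldiRecognizer
b_times = [0.6530, 0.5320, 1.0760]  # B: stock Recognizer.recognize_vosk

# Skip round 1: A's first call includes the one-off model load.
ratios = [b / a for a, b in zip(a_times[1:], b_times[1:])]

for round_no, ratio in enumerate(ratios, start=2):
    print(f"round {round_no}: B took {ratio:.2f}x as long as A")
```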
Full Code
import os
import time

import speech_recognition as sr

r = sr.Recognizer()


def recognize_vosk(self, audio_data, language='en'):
    # Copy of Recognizer.recognize_vosk, modified to cache the KaldiRecognizer.
    from vosk import Model, KaldiRecognizer

    assert isinstance(audio_data, sr.AudioData), "Data must be audio data"

    if not hasattr(self, 'vosk_model'):
        if not os.path.exists("model"):
            return "Please download the model from https://github.com/alphacep/vosk-api/blob/master/doc/models.md and unpack as 'model' in the current folder."
        self.vosk_model = Model("model")

    if not hasattr(self, 'vosk_model_kaldirecognizer'):
        # First call: build the KaldiRecognizer once and keep it on the instance.
        self.vosk_model_kaldirecognizer = rec = KaldiRecognizer(self.vosk_model, 16000)
    else:
        # Later calls: reuse the cached recognizer and clear its previous state.
        rec = self.vosk_model_kaldirecognizer
        rec.Reset()

    rec.AcceptWaveform(audio_data.get_raw_data(convert_rate=16000, convert_width=2))
    finalRecognition = rec.FinalResult()
    return finalRecognition


def timeit(callback, *args):
    # Return the wall-clock seconds taken by callback(*args).
    start = time.time()
    callback(*args)
    end = time.time()
    return end - start


def on_word(_, audio):
    # A: modified function above; B: the library's own method.
    a_time_taken = timeit(recognize_vosk, r, audio)
    b_time_taken = timeit(r.recognize_vosk, audio)
    print("A Time taken:", a_time_taken)
    print("B Time taken:", b_time_taken)


source = sr.Microphone()
print("Listening...")
r.pause_threshold = 1
back = r.listen_in_background(source, on_word, phrase_time_limit=2)

# Keep the main thread alive without busy-waiting.
while True:
    time.sleep(0.1)
Discussion
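It seems that caching the KaldiRecognizer and calling its Reset method between utterances speeds recognition up significantly. So this looks like an actual issue in the recognize_vosk code, which apparently builds a new KaldiRecognizer on every call.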
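The caching pattern described here can be shown on its own, without vosk installed. The sketch below is a minimal illustration, not the library's implementation: FakeDecoder and CachingRecognizer are hypothetical names, and FakeDecoder only mimics the three KaldiRecognizer methods used in the code above (Reset, AcceptWaveform, FinalResult). It counts constructor calls to show that the decoder is built once and reset afterwards.

```python
class FakeDecoder:
    """Hypothetical stand-in for vosk.KaldiRecognizer (not the real class)."""

    instances = 0  # how many times the (assumed costly) constructor ran

    def __init__(self, model, sample_rate):
        FakeDecoder.instances += 1
        self.model = model
        self.sample_rate = sample_rate
        self.buffer = b""

    def Reset(self):
        # Drop any audio left over from the previous utterance.
        self.buffer = b""

    def AcceptWaveform(self, data):
        self.buffer += data

    def FinalResult(self):
        return '{"text": "%d bytes"}' % len(self.buffer)


class CachingRecognizer:
    """Builds the decoder lazily, then reuses it for every later call."""

    def __init__(self, decoder_factory):
        self._factory = decoder_factory
        self._decoder = None

    def recognize(self, raw_audio):
        if self._decoder is None:
            # First call: pay the construction cost once.
            self._decoder = self._factory()
        else:
            # Later calls: reuse the cached decoder with a clean state.
            self._decoder.Reset()
        self._decoder.AcceptWaveform(raw_audio)
        return self._decoder.FinalResult()


recognizer = CachingRecognizer(lambda: FakeDecoder(model="model", sample_rate=16000))
first = recognizer.recognize(b"\x00" * 320)
second = recognizer.recognize(b"\x00" * 640)
print(first, second, "decoders built:", FakeDecoder.instances)
```

Because the second result reflects only the second buffer, the Reset call is what keeps a reused decoder from mixing utterances. Whether the real KaldiRecognizer is safe to share across threads is not covered by this sketch.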