Can the whisper stream example take audio files as input, e.g. PCM or WAV?
You can use a tool like ffmpeg or avconv to convert any audio or video format into something Whisper can read. Try something like this:
$ ffmpeg -i video.mp4 -f wav -ar 16000 - | ./main -m path/to/model.ggml.bin -
Note the trailing - on both commands: the first tells ffmpeg to write the WAV data to stdout, and the second tells whisper to read its input from stdin.
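If you want to drive the same pipeline from a script, here is a minimal Python sketch of the command above. The helper names `ffmpeg_cmd` and `transcribe` are made up for illustration; the ffmpeg flags and the `./main -m ... -` invocation are copied from the command in this comment, and both binaries are assumed to be available locally.

```python
import subprocess

def ffmpeg_cmd(src, rate=16000):
    """Build the ffmpeg command from the thread: decode `src` to WAV on stdout."""
    return ["ffmpeg", "-i", src, "-f", "wav", "-ar", str(rate), "-"]

def transcribe(src, model, main="./main"):
    """Pipe ffmpeg's stdout into whisper's stdin and return whisper's stdout."""
    ffmpeg = subprocess.Popen(ffmpeg_cmd(src), stdout=subprocess.PIPE)
    whisper = subprocess.Popen(
        [main, "-m", model, "-"],  # trailing "-" means: read audio from stdin
        stdin=ffmpeg.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    ffmpeg.stdout.close()  # so ffmpeg sees a broken pipe if whisper exits early
    out, _ = whisper.communicate()
    ffmpeg.wait()
    return out
```

Something like `print(transcribe("video.mp4", "path/to/model.ggml.bin"))` would then behave like the shell one-liner.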
got it, thanks.
Sox is also very handy: sox input.wav -r 16000 -b 16 output.wav
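To check whether a file already matches what the commands in this thread produce (16 kHz, 16-bit), a small sketch using Python's standard `wave` module could look like this. The function name `is_whisper_ready` is hypothetical, and the target values simply mirror the `-ar 16000` / `-r 16000 -b 16` flags above.

```python
import wave

def is_whisper_ready(path, rate=16000, sample_bytes=2):
    """Return True if `path` is a PCM WAV at `rate` Hz with `sample_bytes` bytes per sample."""
    try:
        with wave.open(path, "rb") as wav:
            return wav.getframerate() == rate and wav.getsampwidth() == sample_bytes
    except (wave.Error, EOFError):
        # Not a WAV file the wave module can parse (or an empty file).
        return False
```

If it returns False, run the file through ffmpeg or sox first.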
Is there a way to feed audio files into a continuously waiting instance of stream? I've been using main on demand, but it's very slow compared to how stream works, probably because it has to load the model every time a new audio snippet is ready.
Stream is much better for fast 'on demand' work, except that its only input option is the microphone, which in my case is already occupied.
It seems the server tool is useful for this use case: https://github.com/ggerganov/whisper.cpp/tree/master/examples/server
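As a rough sketch of what that could look like from a client: the server keeps the model loaded, and each audio snippet is uploaded over HTTP. The URL `http://127.0.0.1:8080/inference`, the form field name `file`, and the `response_format` field are assumptions based on my reading of the server example's README, so check them against the version you build. The helper names are made up, and only the Python standard library is used.

```python
import json
import urllib.request
import uuid

def build_multipart(fields, filename, data, boundary=None):
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    # The file part; "file" as the field name is an assumption (see above).
    parts.append(
        (f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
         f'filename="{filename}"\r\nContent-Type: audio/wav\r\n\r\n').encode()
        + data + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"

def transcribe_via_server(path, url="http://127.0.0.1:8080/inference"):
    """Upload a WAV file to a running server instance and return the parsed JSON reply."""
    with open(path, "rb") as f:
        body, content_type = build_multipart({"response_format": "json"}, path, f.read())
    req = urllib.request.Request(url, data=body, headers={"Content-Type": content_type})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Since the model is loaded once when the server starts, each call should only pay for the upload and the inference itself.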
@slaren, would you consider a PR that optionally links the server/stream executables against libffmpeg/libsox, so the input can be converted on the fly and in memory (instead of spawning an external process)? Best
That would be up to @ggerganov , but I see no issue with it as long as it is optional.
Yup, it could be a good addition
Good. @ggerganov, is libffmpeg the best option, or would you prefer another lib (sox, ...)? Refs: https://github.com/FFmpeg/FFmpeg https://johnvansickle.com/ffmpeg/
Why not make these compile-time options?