Slow running times
I am experiencing slow running times on a Linux machine with AVX support, more than twice what I get with the original OpenAI implementation.
For example, when transcribing an English file using the Small model in OpenAI Whisper I get a running time factor of 0.52 with a single thread and 0.18 with 4 threads. With the Medium model I get a factor of 1.47 with a single thread, and 0.48 with 4 threads.
With whisper.cpp and the Small model I get a factor of 1.27 with a single thread and 0.35 with 4 threads. With the Medium model I get a factor of 3.9 with a single thread, and 1.1 with 4 threads. All results are obtained with greedy search. With beam search the situation is even worse.
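For clarity, here is how I am computing the running time factor (an assumption on my part: processing time divided by audio duration, so lower is faster; the numbers below are illustrative, not taken from my benchmarks):

```python
def rtf(processing_seconds: float, audio_seconds: float) -> float:
    """Running time factor: wall-clock processing time / audio duration.
    A value below 1.0 means faster than real time."""
    return processing_seconds / audio_seconds

# e.g. a 60 s clip transcribed in 31.2 s:
print(rtf(31.2, 60.0))  # 0.52
```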
Am I doing something wrong?
Did you build with CUDA enabled?