
Do not apply .wav to .vtt file

Open binarykitchen opened this issue 1 year ago • 4 comments

Once whisper has transcribed the audio and generated the subtitle file, your code leaves .wav in the subtitle file name, which is a bug.

For example, it produces videomail.wav.vtt when it should be just videomail.vtt.

It would also be good if the output file name could be configured with a new option.

Thanks

binarykitchen avatar Sep 27 '24 00:09 binarykitchen

How is this a bug? It just appends .vtt to whatever the source filename was. That is the default behavior of whisper.cpp, and this library does not change it. This script does have an additional feature that may convert your input sound file into a .wav file if it is not one already, which might be where your confusion comes from.

timkrins avatar Sep 29 '24 15:09 timkrins

This script does have an additional feature that may convert your input sound file into a .wav file if it is not one already, which might be where your confusion comes from.

I am not using it anymore, yet it always appends .vtt, no matter what your input is. A name like filename.wav.vtt is confusing; it should just be renamed to filename.vtt, without wav in it.

But when you say it's Whisper's default behaviour, should I raise this in the Whisper repository instead? If so, which one is it?

binarykitchen avatar Sep 29 '24 22:09 binarykitchen

@binarykitchen The underlying whisper.cpp project controls most of this behavior, but we have an option. We can add code to rename the generated file at the end of our wrapper. We could also add an extra option in this npm library for a custom output file name. This would allow for more customized output files. I can implement these changes if they're important to you.
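A minimal sketch of what such a rename step could look like, assuming the wrapper already knows the path of the subtitle file that whisper.cpp wrote. The helper names `stripAudioExtension` and `renameSubtitle` are hypothetical and not part of nodejs-whisper; only Node's built-in `path` and `fs/promises` modules are used.

```typescript
import { rename } from "node:fs/promises";
import * as path from "node:path";

// Hypothetical helper: maps "videomail.wav.vtt" to "videomail.vtt".
// Only an audio extension sitting directly before the subtitle
// extension is removed; any other name is returned unchanged.
function stripAudioExtension(subtitlePath: string, audioExt = ".wav"): string {
  const dir = path.dirname(subtitlePath);
  const subtitleExt = path.extname(subtitlePath); // e.g. ".vtt"
  const stem = path.basename(subtitlePath, subtitleExt); // e.g. "videomail.wav"
  const innerExt = path.extname(stem); // e.g. ".wav"
  if (innerExt.toLowerCase() !== audioExt) {
    return subtitlePath; // nothing to strip
  }
  const cleanStem = path.basename(stem, innerExt); // e.g. "videomail"
  return path.join(dir, cleanStem + subtitleExt);
}

// Hypothetical post-processing step: renames the generated file on disk
// and returns the final path. Note that fs rename replaces an existing
// file at the target path.
async function renameSubtitle(subtitlePath: string): Promise<string> {
  const target = stripAudioExtension(subtitlePath);
  if (target !== subtitlePath) {
    await rename(subtitlePath, target);
  }
  return target;
}
```

Keeping the name computation separate from the file system call means the mapping can be checked without touching the disk, and a custom output name option could later replace `stripAudioExtension` without changing the rename step.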

ChetanXpro avatar Oct 06 '24 10:10 ChetanXpro

@ChetanXpro I've already fixed this in my project temporarily.

Adding more options to your library doesn't feel right. We should avoid adding too many options. I think the problem should be raised within the "whisper.cpp" project. Do you have a link?

(because using two file extensions like abc.wav.vtt feels like an antipattern)

binarykitchen avatar Oct 06 '24 10:10 binarykitchen

Yeah, I agree. You can raise this issue in the whisper.cpp project: https://github.com/ggerganov/whisper.cpp

ChetanXpro avatar Dec 13 '24 14:12 ChetanXpro