misutoneko

Results: 33 comments by misutoneko

Upgrade, if you can...yeah I know, that's what they always say :D I suspect your error message is a result of the code being run by python2 and not python3...

Seems fine here with Mint 20.3 and python3 (should be pretty much the same as Ubuntu 20). A couple of lines needed a patch though: [unstrip_python3_patch.diff.gz](https://github.com/pzread/unstrip/files/8495413/unstrip_python3_patch.diff.gz)

One thing I think is worth mentioning here if using the latter method: The patching stage is bypassed, so it may be necessary to apply the MXE patches (such as...

Well, one interesting idea I've seen would be to add an ["IPA-language"](https://github.com/openai/whisper/discussions/318). That way you'd get an approximate representation of how the spoken words sound, and can then determine...

This seems to be dependent on the language, I see a similar effect with -l fi and several others. My understanding is that the problem originates from the training data...

Yes, this would be much appreciated, I'm not sure how much can be done without retraining the model(s) though. I suppose you are using the large model? I've found the...

I guess it's caused by ASLR because adding -no-pie to CFLAGS seems to fix it. The printout is a bit screwy, sgerwk's fork has some fixes for that.

[whisper.cpp_limit_language_autodetection_patch.diff.gz](https://github.com/ggerganov/whisper.cpp/files/12539830/whisper.cpp_limit_language_autodetection_patch.diff.gz) Here's a little patch you can try. This will extend the "auto" parameter in the main example so that you can give it a list of allowed languages. So...
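For illustration only, the general idea of limiting autodetection to a list of allowed languages can be sketched in Python as follows. This is not the code from the patch (which isn't reproduced here); the function name `pick_language` and the probability dictionary are hypothetical, and the sketch assumes the detector yields one score per language code.

```python
def pick_language(probs, allowed=None):
    """Return the most probable language code, optionally restricted to `allowed`.

    `probs` is a hypothetical mapping of language code -> detection score.
    `allowed` is an optional collection of language codes to consider.
    """
    if allowed:
        # Keep only the candidates the user explicitly permitted.
        candidates = {lang: p for lang, p in probs.items() if lang in allowed}
    else:
        # No restriction given: behave like plain autodetection.
        candidates = dict(probs)
    if not candidates:
        raise ValueError("none of the allowed languages were scored")
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    scores = {"en": 0.5, "fi": 0.3, "sv": 0.2}  # made-up example scores
    print(pick_language(scores, allowed=["fi", "sv"]))
```

The point of the design is that the detector still scores everything, and the restriction is applied only when choosing the winner.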

There's the --duration (or -d) switch. It doesn't eat bytes but milliseconds though, so if you need to use the file size it will be necessary to do some calculation...
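The calculation could look something like the sketch below. It assumes the input is an uncompressed PCM WAV file with a plain 44-byte header, 16 kHz sample rate, mono, 16-bit samples; those defaults and the function name `wav_bytes_to_ms` are assumptions for illustration, so adjust them if your file differs.

```python
def wav_bytes_to_ms(file_size, sample_rate=16000, channels=1,
                    sample_width=2, header_bytes=44):
    """Convert a PCM WAV file size in bytes to a duration in whole milliseconds.

    All defaults are assumptions: 16 kHz, mono, 16-bit samples, 44-byte header.
    """
    # Audio payload is whatever remains after the header (never negative).
    data_bytes = max(file_size - header_bytes, 0)
    # Bytes consumed by one second of audio.
    bytes_per_second = sample_rate * channels * sample_width
    # Integer milliseconds, rounded down.
    return data_bytes * 1000 // bytes_per_second


if __name__ == "__main__":
    import os
    import sys

    # Usage: python script.py some_file.wav
    print(wav_bytes_to_ms(os.path.getsize(sys.argv[1])))
```

The resulting number is what you would then pass to the --duration (or -d) switch.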

Have you tried with, say, 4 threads to see if it still behaves the same? I think I saw a graph somewhere that you don't get that much benefit from extra threads...