sssshhhhhh

Results: 6 comments of sssshhhhhh

This is just how BLAS libs are: they take shortcuts and treat fp ops as associative. I ran your torch equivalent on my CPU and actually got the same 5...
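A minimal plain-Python illustration of the associativity point; nothing BLAS-specific here, just IEEE 754 doubles:

```python
# Float addition is not associative, so any library that reorders a
# reduction (threading, SIMD lanes, blocking) can change the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.6000000000000001
right = a + (b + c)  # 0.6

print(left, right, left == right)  # -> False
```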

I don't think it's that bad considering bf16 only has 7 mantissa bits. In the range [0.015625, 0.03125) the spacing between representable values is about 1.2e-4. Even within torch with bf16...
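A quick torch sketch of that spacing claim; the two sample values below are just arbitrary picks inside that range:

```python
import torch

# bf16 has 7 explicit mantissa bits, so machine epsilon is 2**-7
eps = torch.finfo(torch.bfloat16).eps  # 0.0078125

# in [2**-6, 2**-5) the spacing between representable values is
# eps * 2**-6 = 2**-13 ~= 1.22e-4
print(eps * 2**-6)  # 0.0001220703125

# two floats ~3e-5 apart collapse to the same bf16 value
a = torch.tensor(0.01575, dtype=torch.bfloat16)
b = torch.tensor(0.01578, dtype=torch.bfloat16)
print(a.item(), b.item(), bool(a == b))  # 0.0157470703125 twice, True
```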

An example where the tokens are completely different might help. I tested a random finetune (jlvdoorn/whisper-large-v3-atco2-asr) and had no problems with it either.

| fp16   | atco2 | large-v3 |
|--------|-------|----------|
| openai | ...   |

You have temperature fallback enabled in faster-whisper, which is inherently random. Also turn off condition_on_previous_text, since that amplifies deviations, especially if you aren't training with prompts. Since you're doing greedy...
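Something like this is what I mean; the model size and audio path here are placeholders, not from your report:

```python
from faster_whisper import WhisperModel

model = WhisperModel("large-v3")  # placeholder model

# temperature=0.0 disables the fallback ladder (no resampling),
# condition_on_previous_text=False keeps earlier output from feeding
# into later windows, where small deviations get amplified
segments, info = model.transcribe(
    "audio.wav",  # placeholder path
    temperature=0.0,
    condition_on_previous_text=False,
)
for seg in segments:
    print(seg.start, seg.end, seg.text)
```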

Yes, text only tho

In transformers, word timestamps are return_timestamps='word', not True. This model also errors in hf and openai because the distil process removes layers, which makes the alignment heads refer to heads in nonexistent...
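For reference, a minimal sketch of the pipeline call; the model id and audio path here are placeholders:

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")

# return_timestamps=True gives segment-level timestamps;
# word-level timestamps require the string "word"
out = asr("audio.wav", return_timestamps="word")
print(out["chunks"])  # [{'text': ..., 'timestamp': (start, end)}, ...]
```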