sssshhhhhh

Results: 6 comments of sssshhhhhh

This is just how BLAS libs are: they take shortcuts and treat fp ops as associative. I ran your torch equivalent on my CPU and actually got the same 5...
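A minimal plain-Python illustration of the associativity point; nothing BLAS-specific here, just IEEE 754 doubles:

```python
# Float addition is not associative, so any library that reorders a
# reduction (threading, SIMD lanes, blocking) can change the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.6000000000000001
right = a + (b + c)  # 0.6

print(left, right, left == right)  # -> False
```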

I don't think it's that bad considering bf16 only has 7 mantissa bits. In the range [0.015625, 0.03125) the spacing between representable values is about 1.2e-4. Even within torch with bf16...
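A quick torch sketch of that spacing claim; the two sample values below are just arbitrary picks inside that range:

```python
import torch

# bf16 has 7 explicit mantissa bits, so machine epsilon is 2**-7
eps = torch.finfo(torch.bfloat16).eps  # 0.0078125

# in [2**-6, 2**-5) the spacing between representable values is
# eps * 2**-6 = 2**-13 ~= 1.22e-4
print(eps * 2**-6)  # 0.0001220703125

# two floats ~3e-5 apart collapse to the same bf16 value
a = torch.tensor(0.01575, dtype=torch.bfloat16)
b = torch.tensor(0.01578, dtype=torch.bfloat16)
print(a.item(), b.item(), bool(a == b))  # 0.0157470703125 twice, True
```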

An example where the tokens are completely different might help. I tested a random finetune (jlvdoorn/whisper-large-v3-atco2-asr) and had no problems with it either.

| fp16   | atco2 | large-v3 |
|--------|-------|----------|
| openai | ...   |

You have temperature fallback enabled in faster-whisper, which is inherently random. Also turn off condition_on_previous_text, since that amplifies deviations, especially if you aren't training with prompts. Since you're doing greedy...
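Something like this is what I mean; the model size and audio path here are placeholders, not from your report:

```python
from faster_whisper import WhisperModel

model = WhisperModel("large-v3")  # placeholder model

# temperature=0.0 disables the fallback ladder (no resampling),
# condition_on_previous_text=False keeps earlier output from feeding
# into later windows, where small deviations get amplified
segments, info = model.transcribe(
    "audio.wav",  # placeholder path
    temperature=0.0,
    condition_on_previous_text=False,
)
for seg in segments:
    print(seg.start, seg.end, seg.text)
```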

Yes, text only tho

In transformers, word timestamps are return_timestamps='word', not True. This model also errors in hf and openai because the distil process removes layers, which makes the alignment heads refer to heads in nonexistent...
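For reference, a minimal sketch of the pipeline call; the model id and audio path here are placeholders:

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")

# return_timestamps=True gives segment-level timestamps;
# word-level timestamps require the string "word"
out = asr("audio.wav", return_timestamps="word")
print(out["chunks"])  # [{'text': ..., 'timestamp': (start, end)}, ...]
```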