Hannan Komari
Thanks for your response @sanchit-gandhi. I've tested your proposed approach: saving the Flax model after moving it to CPU, then restarting the kernel and loading `FlaxWhisperForConditionalGeneration` with `_do_init` disabled. But inference...
I also found that the whisper_small checkpoint takes ~33 GB of GPU RAM!
> For my fine-tuned whisper-medium, if I don't run inference inside `torch.no_grad()`, I get an error, and it is fixed simply by adding `torch.no_grad()`: `RuntimeError Traceback (most recent call...`
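The pattern in question can be reproduced minimally; the tiny `nn.Linear` below is an illustrative stand-in for the fine-tuned Whisper model, not the actual code from the thread.

```python
# Minimal sketch of the no_grad inference pattern.
import torch
import torch.nn as nn

model = nn.Linear(4, 2).eval()  # stand-in for the fine-tuned Whisper model
x = torch.randn(1, 4)

# Inside no_grad, autograd stops recording, so no activation buffers are
# kept for backprop; this saves memory and avoids grad-related errors.
with torch.no_grad():
    y = model(x)

assert not y.requires_grad
```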
> > Now inference time is 80x larger!
>
> There shouldn't be any difference in inference time - are you certain you're running on GPU here? Make sure you...
> It could also be that we're recompiling each time - would be great to see your code here @hannan72 to verify!

This is my full code: Firstly, the PyTorch model...
I found that when the Flax model is set to use beam search, it calculates the scores value: https://github.com/huggingface/transformers/blob/12d51db243a00726a548a43cc333390ebae731e3/src/transformers/generation/flax_utils.py#L83-L96 and in the `_beam_search` method it is calculated and returned: https://github.com/huggingface/transformers/blob/12d51db243a00726a548a43cc333390ebae731e3/src/transformers/generation/flax_utils.py#L998-L1004 but it doesn't return...
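For reference, the sequence score that beam search typically returns is the sum of token log-probabilities normalized by a length penalty. A hedged sketch of that formula, not the actual transformers internals:

```python
import jax.numpy as jnp

def beam_score(sum_log_probs, length, length_penalty=1.0):
    """Length-penalized sequence score: higher is better (log-probs are <= 0)."""
    return sum_log_probs / (length ** length_penalty)

# A 4-token sequence with total log-prob -4.0 scores -1.0 per token.
print(beam_score(jnp.array(-4.0), length=4))  # -1.0
```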
I ran the Flax Whisper model in beam-search mode by setting `generation_config.num_beams` to a value larger than 1. It returns `scores` in the output, but it is totally different from...
I found the logits of the Flax model in flax_utils.py here: https://github.com/huggingface/transformers/blob/ed67286465c5e9e3d3005de3e21bc3c679d93072/src/transformers/generation/flax_utils.py#L610-L618 We just need to extract these logits from the `greedy_search` function and return them.
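The idea can be illustrated with a toy greedy loop that stacks each step's logits and returns them alongside the sequences. The names and structure below are illustrative only, not the actual flax_utils.py internals:

```python
import jax.numpy as jnp

def greedy_search_with_scores(logits_fn, input_ids, max_new_tokens):
    """Toy greedy decoder that also collects per-step logits as `scores`."""
    scores = []
    ids = input_ids
    for _ in range(max_new_tokens):
        step_logits = logits_fn(ids)                # (batch, vocab_size)
        scores.append(step_logits)                  # keep instead of discarding
        next_token = jnp.argmax(step_logits, axis=-1)[:, None]
        ids = jnp.concatenate([ids, next_token], axis=-1)
    return ids, jnp.stack(scores, axis=1)           # (batch, steps, vocab_size)

# Dummy "model": always favors token 2.
fake_logits = lambda ids: jnp.tile(jnp.array([[0.0, 0.1, 0.9]]), (ids.shape[0], 1))
seqs, scores = greedy_search_with_scores(fake_logits, jnp.zeros((1, 1), jnp.int32), 3)
print(seqs)          # [[0 2 2 2]]
print(scores.shape)  # (1, 3, 3)
```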
I've added support for `output_scores` to the flax_utils.py code in the following fork: https://github.com/hannan72/transformers/commit/116d8f38722359ca5d2dad918975348359cc2ac1 And also added support for the following parameters to the Flax Whisper model: https://github.com/hannan72/transformers/commit/accdcb2d66496c5ee8547739bf833c95e189344c @sanchit-gandhi Could...
I have made a PR about this feature: https://github.com/huggingface/transformers/pull/22700