Vladislav Sokolovskii
### Feature request Currently, the [Wav2Vec2ProcessorWithLM](https://github.com/huggingface/transformers/blob/ae54e3c3b18bac0832ad62ea9b896dfd52a09850/src/transformers/models/wav2vec2_with_lm/processing_wav2vec2_with_lm.py#L67) `decode` function returns [only the best hypothesis](https://github.com/huggingface/transformers/blob/ae54e3c3b18bac0832ad62ea9b896dfd52a09850/src/transformers/models/wav2vec2_with_lm/processing_wav2vec2_with_lm.py#L572). Shall we extend its functionality so that it returns the n-best hypotheses along with their `logit_scores`, `lm_scores`, and `word_offsets` so that people...
# What does this PR do? Fixes #22150. The user can now specify the number of hypotheses to be returned after the decoding stage. If the specified number...
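The idea of returning several ranked hypotheses instead of only the best one can be sketched in plain Python. This is a minimal illustration, not the code from the PR: the `Hypothesis` class, the `n_best` helper, and the sample scores are all hypothetical, and the choice to return every available hypothesis when fewer than `n` exist is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """Hypothetical container for one decoded beam (not a transformers class)."""
    text: str
    logit_score: float  # acoustic score of the beam
    lm_score: float     # score after language-model rescoring


def n_best(hypotheses: List[Hypothesis], n: int = 1) -> List[Hypothesis]:
    """Return up to n hypotheses, ranked by LM score, best first.

    Assumption of this sketch: if fewer than n hypotheses exist,
    all of them are returned rather than raising an error.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ranked = sorted(hypotheses, key=lambda h: h.lm_score, reverse=True)
    return ranked[:n]


# Made-up beams purely for illustration.
beams = [
    Hypothesis("hello word", logit_score=-4.0, lm_score=-7.5),
    Hypothesis("hello world", logit_score=-4.1, lm_score=-5.0),
    Hypothesis("hallo world", logit_score=-5.2, lm_score=-9.3),
]

top_two = n_best(beams, n=2)
for hyp in top_two:
    print(hyp.text, hyp.logit_score, hyp.lm_score)
```

Keeping the scores attached to each hypothesis, rather than returning bare strings, lets a caller rescore or filter the candidates afterwards.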
## Description - Limit is not configurable through the memory config - You can now set the custom similarity threshold to filter out irrelevant memories. Fixes # (issue) ## Type...
Small update to the model name so that it matches the `served-model-name` from the `vllm serve` command.