
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

58 issues

Hi all, thank you for your awesome work! Is it possible to integrate Medusa into [Whisper's decoder](https://huggingface.co/openai/whisper-large-v2) to speed up decoding? Do you have any plans to support Whisper? Thanks in...

@ctlllll Please provide a Dockerfile for Medusa. I had to resolve a lot of errors while doing the setup. It would be good to have a containerized environment that supports both training and inference. Its...

I checked. They are alternate.
```
  File "/root/Medusa/medusa/train/train_legacy.py", line 183, in preprocess
    prompt = tokenizer.apply_chat_template(conversation, tokenize=False)
  File "/root/miniconda3/envs/fschat/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1743, in apply_chat_template
    rendered = compiled_template.render(
  File "/root/miniconda3/envs/fschat/lib/python3.9/site-packages/jinja2/environment.py", line 1304, in render
...
```

How to use a fine-tuned Mistral model for inference with Medusa?

When using Axolotl, the training loss drops to 0 after the gradient accumulation steps. Is this expected behaviour? With torchrun, the training loss is consistently NaN. Thanks for the help!!...

There is a slight bug in the preprocess function that leaves the content of targets identical to input_ids. Relevant modifications are proposed here: https://github.com/FasterDecoding/Medusa/pull/83#discussion_r1582080343
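For context, if targets end up identical to input_ids, the training loss is also computed on prompt tokens instead of only on the response. A typical fix masks the prompt positions in targets with the loss-ignore index. A minimal sketch of that idea (not the actual Medusa code; the function name and prompt-length handling here are illustrative):

```python
import copy

# PyTorch's CrossEntropyLoss skips positions labeled with ignore_index=-100 by default.
IGNORE_TOKEN_ID = -100

def mask_prompt(input_ids, prompt_len):
    """Return targets that differ from input_ids: prompt tokens carry no loss."""
    targets = copy.deepcopy(input_ids)
    targets[:prompt_len] = [IGNORE_TOKEN_ID] * prompt_len
    return targets

# Example: a 3-token prompt followed by a 2-token answer.
ids = [101, 7592, 2088, 3437, 102]
print(mask_prompt(ids, 3))  # [-100, -100, -100, 3437, 102]
```

After this change, targets and input_ids genuinely differ, and the loss only trains the model on the answer tokens.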

Does the Medusa-1 model generate tokens identically to the base model without Medusa heads? I found that changing the Medusa choices changes the output.

I got an error when following https://github.com/FasterDecoding/Medusa to prepare to run the demo. 1. The basic environment was installed without any errors. ``` git clone https://github.com/FasterDecoding/Medusa.git...

Hi there, thank you for the great work! I have a problem. In the Google Colab environment: ``` !git clone https://github.com/FasterDecoding/Medusa.git %cd Medusa !pip install -e . !python -m medusa.inference.cli...

I just tested gen_model_answer_baseline.py and gen_model_answer_medusa.py. The Medusa version generates normally, but there are some problems with gen_model_answer_baseline.py. Can you run this .py file normally?