
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Results: 58 Medusa issues, sorted by recently updated

Changed the broken TGI link pointing to Medusa

The repo contains code and examples for tuning medusa heads for text-only LLMs. Is the code for Medusa(-2) directly compatible with VLMs as well? I assume that Medusa should be...

Suppose the first Medusa head generates the top-2 predictions "It is" and "It's", while the second Medusa head generates the top-3 predictions "difficult", "a", and "not". This results in a...
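
A minimal sketch of how such candidates can be formed: the continuations are the Cartesian product of each head's top-k predictions (2 × 3 = 6 here), which Medusa then verifies together via tree attention. The token strings below are purely illustrative, taken from the example above:

```python
from itertools import product

# Top-k predictions from each Medusa head (illustrative tokens from the example above).
head_1_top2 = ["It is", "It's"]
head_2_top3 = ["difficult", "a", "not"]

# Candidate continuations are the Cartesian product of the heads' predictions:
# 2 x 3 = 6 candidates, all checked in a single forward pass with tree attention.
candidates = [" ".join(tokens) for tokens in product(head_1_top2, head_2_top3)]
print(candidates)
# ['It is difficult', 'It is a', 'It is not', "It's difficult", "It's a", "It's not"]
```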

```
class MedusaModelABC(nn.Module):
    """The Medusa Language Model Head.

    This module creates a series of prediction heads (based on the 'medusa'
    parameter) on top of a given base model. Each head...
    """
```

In the `README.md`, you mentioned that

> The data preparation code for self-distillation can be found in [data_generation folder](https://github.com/FasterDecoding/Medusa/blob/main/data_generation) of the current repo.

In that folder, it says

> `python...

In medusa_model_legacy.py, the Medusa heads are only responsible for generating new hidden states; the medusa logits are still produced by reusing the base_model's lm_head. Here is...
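
A minimal sketch of the design described above, where each Medusa head is a residual block that outputs a hidden state and the shared base-model `lm_head` projects it to vocabulary logits. The sizes and the `ResBlock` structure here are illustrative assumptions, not the repo's exact code:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block: a linear layer with SiLU, added back to its input."""
    def __init__(self, hidden_size):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)
        self.act = nn.SiLU()

    def forward(self, x):
        return x + self.act(self.linear(x))

# Hypothetical wiring: each Medusa head only transforms the last hidden state,
# and the *shared* base-model lm_head maps each transformed state to logits.
hidden_size, vocab_size, num_heads = 4096, 32000, 4
medusa_heads = nn.ModuleList([ResBlock(hidden_size) for _ in range(num_heads)])
lm_head = nn.Linear(hidden_size, vocab_size, bias=False)  # stands in for base_model.lm_head

last_hidden = torch.randn(1, 1, hidden_size)
medusa_logits = [lm_head(head(last_hidden)) for head in medusa_heads]
print(medusa_logits[0].shape)  # torch.Size([1, 1, 32000])
```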

Hi, when I was training with Vicuna v1.3, the loss was always NaN. My training script is:

```
torchrun --nproc_per_node=1 medusa/train/train_legacy.py \
    --model_name_or_path lmsys/vicuna-7b-v1.3 \
    --data_path mistral.json \
    --bf16 True \
    ...
```