Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
How can I launch stage-2 training without using axolotl? Can I just comment out these lines in the [training code](https://github.com/FasterDecoding/Medusa/blob/main/medusa/train/train_legacy.py) and save the whole model? https://github.com/FasterDecoding/Medusa/blob/5e980538695096e7e372c1e27a6bcf142bfeab11/medusa/train/train_legacy.py#L346-L348
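Not an official answer, but a minimal sketch of the idea, assuming the referenced lines are where `train_legacy.py` writes out only the Medusa heads: after commenting them out, one could save the combined model with `save_pretrained` instead. `save_full_model` and its arguments are illustrative names, not repo API.

```python
# Hypothetical sketch: save the whole model (base weights + Medusa heads)
# instead of only the heads' state dict. Assumes `model` is the trained
# model object from train_legacy.py and `output_dir` is the target folder.
def save_full_model(model, tokenizer, output_dir: str):
    # Unwrap DDP/DeepSpeed wrappers if present.
    unwrapped = model.module if hasattr(model, "module") else model
    # save_pretrained writes config + all weights for the combined model,
    # rather than torch.save()-ing only the medusa_head parameters.
    unwrapped.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)

# save_full_model(model, tokenizer, training_args.output_dir)
```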
After starting Docker and running `create_data.py`, the script fails with an error and mistral.json cannot be obtained. Alternatively, could you provide mistral.json directly? Looking forward to your early reply,...
I want to train a llama2 7b model that has been fine-tuned on my own dataset. I don't know which dataset I should use when training the Medusa heads. The dataset...
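For context, the Medusa paper recommends self-distillation for fine-tuned models: sample responses from the fine-tuned model itself and train the heads on that data, so the heads match its output distribution. A rough sketch of generating such data, where the model path, prompt file, and sampling settings are all placeholders:

```python
# Hypothetical self-distillation sketch: build head-training data by sampling
# from the already fine-tuned base model. Paths and generation settings are
# illustrative, not from the repo.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/your-finetuned-llama2-7b"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

records = []
for prompt in json.load(open("prompts.json")):  # your own prompts
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=512,
                         do_sample=True, temperature=0.7)
    # Keep only the newly generated tokens as the response.
    response = tokenizer.decode(out[0][inputs.input_ids.shape[1]:],
                                skip_special_tokens=True)
    records.append({"prompt": prompt, "response": response})

json.dump(records, open("self_distill_data.json", "w"))
```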
I am trying to train Medusa heads (first on the dataset provided as an example, then on my own, much smaller dataset). I am working on an Azure Compute Instance where I...
In the given axolotl examples [examples/medusa](https://github.com/ctlllll/axolotl/tree/main/examples/medusa), I followed `vicuna_7b_qlora_stage1.yml` and `vicuna_7b_qlora_stage2.yml` to write my llama2 training config. However, I didn't get such a great performance improvement; below is my test...
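One thing worth checking before comparing configs is the measurement itself: wall-clock tokens/second over identical prompts, not per-step latency. A rough harness sketch, where the two `generate` closures are placeholders for your baseline and Medusa-augmented models:

```python
# Illustrative benchmarking sketch: compare end-to-end tokens/second of a
# plain model vs. a Medusa-augmented one on the same prompt.
import time
import torch

@torch.inference_mode()
def tokens_per_second(generate_fn, n_new_tokens: int) -> float:
    # generate_fn(n) must generate exactly n new tokens for a fixed prompt.
    start = time.perf_counter()
    generate_fn(n_new_tokens)
    return n_new_tokens / (time.perf_counter() - start)

# Example usage (base_model, medusa_generate, and inputs are placeholders):
# base_tps = tokens_per_second(
#     lambda n: base_model.generate(**inputs, max_new_tokens=n), 256)
# medusa_tps = tokens_per_second(
#     lambda n: medusa_generate(inputs, max_new_tokens=n), 256)
# print(f"speedup: {medusa_tps / base_tps:.2f}x")
```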
As the title says: how can chatglm3-6b be supported? As I understand it, Medusa just produces a new model based on chatglm3-6b by adding new heads?
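For reference, that understanding matches the paper: Medusa leaves the base model intact and adds extra decoding heads on top of its last hidden state, so in principle the same construction could wrap any causal LM, including chatglm3-6b. A simplified sketch of the head structure described in the paper (dimension names are illustrative):

```python
# Simplified sketch of Medusa's extra heads, per the paper: each head is a
# residual block plus a linear projection to the vocabulary, fed with the
# base model's last hidden state. The base model itself is unchanged.
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)
        self.act = nn.SiLU()

    def forward(self, x):
        return x + self.act(self.linear(x))

class MedusaHeads(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int, num_heads: int = 4):
        super().__init__()
        # Head k predicts the token at position t + k + 1.
        self.heads = nn.ModuleList(
            nn.Sequential(ResBlock(hidden_size),
                          nn.Linear(hidden_size, vocab_size, bias=False))
            for _ in range(num_heads)
        )

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden_size) from the base model.
        return [head(hidden_states) for head in self.heads]
```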
Hi, thanks for this great work. Is there any plan to support the HYDRA model, which builds on Medusa (https://arxiv.org/pdf/2402.05109.pdf)?
There is a lot of misinformation and too much to read. This has been the case since the 1960s, when the first punch-hole-based programming cards were introduced and...
### Support BatchSize > 1 This PR is supposed to support batch size > 1 for the Medusa inference model. This is only a draft for now and needs further improvement. ###...
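Not part of this PR, but for readers: a common prerequisite for batch > 1 decoding is left-padding the prompts so every row in the batch can append candidate tokens on the right. A small illustrative sketch (the model name is just an example):

```python
# Illustrative only: left-pad a batch of prompts so batched speculative
# decoding can concatenate candidate tokens on the right for every sequence.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("lmsys/vicuna-7b-v1.3")  # example
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

prompts = ["Hello, how are you?", "Summarize Medusa in one sentence."]
batch = tokenizer(prompts, return_tensors="pt", padding=True)
# batch.input_ids: (2, max_len); batch.attention_mask marks real tokens,
# so padded positions are ignored when scoring each row's candidates.
```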
I checked TRT-LLM but found something confusing. Some features are not supported: 1. inference limited to batch size == 1 (seems to have been solved recently); 2. no support for in-flight batching, which will...