Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
How can I launch stage-2 training without using axolotl? Can I just comment out these lines in the [training code](https://github.com/FasterDecoding/Medusa/blob/main/medusa/train/train_legacy.py) and save the whole model? https://github.com/FasterDecoding/Medusa/blob/5e980538695096e7e372c1e27a6bcf142bfeab11/medusa/train/train_legacy.py#L346-L348
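Not an official answer, but a minimal sketch of the idea, assuming the referenced lines are where `train_legacy.py` writes out only the Medusa heads: after commenting them out, one could save the combined model with `save_pretrained` instead. `save_full_model` and its arguments are illustrative names, not repo API.

```python
# Hypothetical sketch: save the whole model (base weights + Medusa heads)
# instead of only the heads' state dict. Assumes `model` is the trained
# model object from train_legacy.py and `output_dir` is the target folder.
def save_full_model(model, tokenizer, output_dir: str):
    # Unwrap DDP/DeepSpeed wrappers if present.
    unwrapped = model.module if hasattr(model, "module") else model
    # save_pretrained writes config + all weights for the combined model,
    # rather than torch.save()-ing only the medusa_head parameters.
    unwrapped.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)

# save_full_model(model, tokenizer, training_args.output_dir)
```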
After starting Docker and running `create_data.py`, the script fails with an error and mistral.json cannot be obtained. Alternatively, could you provide mistral.json directly? Looking forward to your early reply,...
I want to train a llama2 7b model that has been fine-tuned on my own dataset. I don't know which dataset I should use when training the Medusa heads. The dataset...
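For context, the Medusa paper recommends self-distillation for fine-tuned models: sample responses from the fine-tuned model itself and train the heads on that data, so the heads match its output distribution. A rough sketch of generating such data, where the model path, prompt file, and sampling settings are all placeholders:

```python
# Hypothetical self-distillation sketch: build head-training data by sampling
# from the already fine-tuned base model. Paths and generation settings are
# illustrative, not from the repo.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/your-finetuned-llama2-7b"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

records = []
for prompt in json.load(open("prompts.json")):  # your own prompts
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=512,
                         do_sample=True, temperature=0.7)
    # Keep only the newly generated tokens as the response.
    response = tokenizer.decode(out[0][inputs.input_ids.shape[1]:],
                                skip_special_tokens=True)
    records.append({"prompt": prompt, "response": response})

json.dump(records, open("self_distill_data.json", "w"))
```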
I am trying to train Medusa heads (first on the dataset provided as an example, then on my own, much smaller dataset). I am working on an Azure Compute Instance where I...
In the given axolotl examples [examples/medusa](https://github.com/ctlllll/axolotl/tree/main/examples/medusa), I followed `vicuna_7b_qlora_stage1.yml` and `vicuna_7b_qlora_stage2.yml` to write my llama2 training config. However, I didn't get such a great performance improvement; below is my test...
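One thing worth checking before comparing configs is the measurement itself: wall-clock tokens/second over identical prompts, not per-step latency. A rough harness sketch, where the two `generate` closures are placeholders for your baseline and Medusa-augmented models:

```python
# Illustrative benchmarking sketch: compare end-to-end tokens/second of a
# plain model vs. a Medusa-augmented one on the same prompt.
import time
import torch

@torch.inference_mode()
def tokens_per_second(generate_fn, n_new_tokens: int) -> float:
    # generate_fn(n) must generate exactly n new tokens for a fixed prompt.
    start = time.perf_counter()
    generate_fn(n_new_tokens)
    return n_new_tokens / (time.perf_counter() - start)

# Example usage (base_model, medusa_generate, and inputs are placeholders):
# base_tps = tokens_per_second(
#     lambda n: base_model.generate(**inputs, max_new_tokens=n), 256)
# medusa_tps = tokens_per_second(
#     lambda n: medusa_generate(inputs, max_new_tokens=n), 256)
# print(f"speedup: {medusa_tps / base_tps:.2f}x")
```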
As the title says: how can chatglm3-6b be supported? As I understand it, Medusa just produces a new model based on chatglm3-6b by adding new heads?
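For reference, that understanding matches the paper: Medusa leaves the base model intact and adds extra decoding heads on top of its last hidden state, so in principle the same construction could wrap any causal LM, including chatglm3-6b. A simplified sketch of the head structure described in the paper (dimension names are illustrative):

```python
# Simplified sketch of Medusa's extra heads, per the paper: each head is a
# residual block plus a linear projection to the vocabulary, fed with the
# base model's last hidden state. The base model itself is unchanged.
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)
        self.act = nn.SiLU()

    def forward(self, x):
        return x + self.act(self.linear(x))

class MedusaHeads(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int, num_heads: int = 4):
        super().__init__()
        # Head k predicts the token at position t + k + 1.
        self.heads = nn.ModuleList(
            nn.Sequential(ResBlock(hidden_size),
                          nn.Linear(hidden_size, vocab_size, bias=False))
            for _ in range(num_heads)
        )

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden_size) from the base model.
        return [head(hidden_states) for head in self.heads]
```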
Hi, thanks for this great work. Is there any plan to support the HYDRA model, which builds on Medusa (https://arxiv.org/pdf/2402.05109.pdf)?
There is a lot of misinformation and too much to read. This has been the case since the 1960s, when the first punch-hole-based programming cards were introduced and...
### Support BatchSize > 1 This PR is supposed to support batch size > 1 for the Medusa inference model. This is only a draft for now and needs further improvement. ###...
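Not part of this PR, but for readers: a common prerequisite for batch > 1 decoding is left-padding the prompts so every row in the batch can append candidate tokens on the right. A small illustrative sketch (the model name is just an example):

```python
# Illustrative only: left-pad a batch of prompts so batched speculative
# decoding can concatenate candidate tokens on the right for every sequence.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("lmsys/vicuna-7b-v1.3")  # example
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

prompts = ["Hello, how are you?", "Summarize Medusa in one sentence."]
batch = tokenizer(prompts, return_tensors="pt", padding=True)
# batch.input_ids: (2, max_len); batch.attention_mask marks real tokens,
# so padded positions are ignored when scoring each row's candidates.
```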
I checked TRT-LLM but found something confusing. Some features are not supported: 1. inference limited to batch size == 1 (seems to have been solved recently); 2. no support for in-flight batching, which will...