OfirArviv comments

Results: 9 comments of OfirArviv

Diverging evaluation loss using finetuning scripts Guanaco 7b

I have the same issue, with flan-t5-xxl, ul2 and xglm. But I ran the same code without 4bit and just LoRA, and the model converged normally. So it is the...

Evaluation loss issue (strange trend)

I have the same problem. I fine-tuned mt5-xxl, ul2 and xglm-7.5 models on 2 datasets, and the models managed to "learn" for a good number of steps, but usually after...

Add llm as judge mt-bench dataset and metrics

@yoavkatz @elronbandel when you read this, please go over it thoroughly. There were a lot of changes, so please make sure I didn't forget anything. These are the original MT-Bench prompts...

Add llm as judge mt-bench dataset and metrics

> Regarding naming, I think the right names for the tasks are: `evaluation.response_rating` (single turn) `evaluation.response_rating.with_reference` `evaluation.response_selection` or `evaluation.response_preference` etc > > also in the task fields I would change...

Add llm as judge mt-bench dataset and metrics

> > Regarding naming, I think the right names for the tasks are: `evaluation.response_rating` (single turn) `evaluation.response_rating.with_reference` `evaluation.response_selection` or `evaluation.response_preference` etc > > also in the task fields I would...

OfirArviv

Diverging evaluation loss using finetuning scripts Guanaco 7b

Evaluation loss issue (strange trend)

Add llm as judge mt-bench dataset and metrics

Add llm as judge mt-bench dataset and metrics

Add llm as judge mt-bench dataset and metrics

Add llm as judge mt-bench dataset and metrics

Add llm as judge mt-bench dataset and metrics

A replacement is needed for dataset lmsys/arena-hard-browser that was gone from HF

Failed to download MMBench_V11.tsv error