
Error when fine-tuning: use_cache is not supported

Open zhangsanfeng86 opened this issue 2 years ago • 9 comments

image

zhangsanfeng86 avatar Apr 04 '23 19:04 zhangsanfeng86

@zhangsanfeng86 Why would you need to use the cache for fine-tuning? It is only useful for decoding.

zhisbug avatar Apr 04 '23 20:04 zhisbug

@zhisbug How do I set "use_cache=False"? I just ran "train_mem.py".

zhangsanfeng86 avatar Apr 04 '23 20:04 zhangsanfeng86

Did you turn on gradient_checkpointing? It needs to be turned on for train_mem.py, because that implicitly turns off use_cache.

Michaelvll avatar Apr 04 '23 23:04 Michaelvll
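
For the question of how to set use_cache=False by hand, here is a minimal sketch. It assumes the model object follows the Hugging Face transformers convention of exposing `config.use_cache` and a `gradient_checkpointing_enable()` method. The helper name `disable_kv_cache_for_training` is hypothetical and is not part of FastChat or train_mem.py.

```python
def disable_kv_cache_for_training(model, enable_checkpointing=True):
    """Prepare a transformers-style causal LM for fine-tuning.

    Assumption: `model` exposes `config.use_cache` and, optionally,
    `gradient_checkpointing_enable()`. This helper is illustrative only.
    """
    # The key/value cache only helps step-by-step decoding, so it is
    # switched off explicitly before training.
    model.config.use_cache = False

    if enable_checkpointing and hasattr(model, "gradient_checkpointing_enable"):
        # Gradient checkpointing recomputes activations in the backward
        # pass, trading extra compute for lower memory use.
        model.gradient_checkpointing_enable()

    return model
```

A call such as `disable_kv_cache_for_training(model)` would go after the model is loaded and before training starts. Whether this is needed at all depends on the training script, since the reply above notes that turning on gradient_checkpointing already disables use_cache implicitly.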

Thank you, I'll try!

zhangsanfeng86 avatar Apr 05 '23 01:04 zhangsanfeng86

> Did you turn on gradient_checkpointing? It needs to be turned on for train_mem.py, because that implicitly turns off use_cache.

Hi @Michaelvll, I turned on 'gradient_checkpointing' and now get this warning when fine-tuning.

image

zhangsanfeng86 avatar Apr 06 '23 03:04 zhangsanfeng86

Hi, we have released a new training script and a new version of the weights (https://github.com/lm-sys/FastChat/blob/main/docs/weights_version.md). You can follow https://github.com/lm-sys/FastChat#fine-tuning-vicuna-7b-with-local-gpus, which only uses 4 x A100 (40GB).

The error you mentioned can be safely ignored. Could you try the latest training script again?

merrymercy avatar Apr 12 '23 21:04 merrymercy

special_tokens_map.json, tokenizer.model, tokenizer_config.json

@merrymercy These 3 files seem to be missing from the Hugging Face repo for version 1.1. Could you please add them? The apply_delta code is raising errors about missing tokenizer files.

Thank you.

samarthsarin avatar Apr 13 '23 13:04 samarthsarin

@samarthsarin Could you uninstall fschat and reinstall the latest fschat? The tokenizer files are omitted on purpose because we didn't change the tokenizer. apply_delta.py will just copy LLaMA's tokenizer.

merrymercy avatar Apr 13 '23 18:04 merrymercy
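
To illustrate the idea of reusing the base model's tokenizer, here is a rough sketch that copies the three tokenizer files named earlier from one directory to another. It is not the actual apply_delta.py code. The function name `copy_tokenizer_files` and both directory arguments are hypothetical.

```python
import shutil
from pathlib import Path

# File names taken from the comment earlier in this thread.
TOKENIZER_FILES = (
    "special_tokens_map.json",
    "tokenizer.model",
    "tokenizer_config.json",
)


def copy_tokenizer_files(base_dir, target_dir):
    """Copy tokenizer files from base_dir into target_dir.

    Returns the names of the files that were found and copied.
    Hypothetical helper for illustration, not FastChat's implementation.
    """
    base, target = Path(base_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in TOKENIZER_FILES:
        src = base / name
        if src.is_file():  # skip files the base directory does not have
            shutil.copy2(src, target / name)
            copied.append(name)
    return copied
```

Here `base_dir` would be the local LLaMA weights directory and `target_dir` the output directory for the merged weights. Both paths depend on the local setup.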

I was not aware that apply_delta.py was also modified in the latest push. Now it's working fine. Thank you.

samarthsarin avatar Apr 13 '23 18:04 samarthsarin

Seems like the issue is resolved.

zhisbug avatar Apr 20 '23 23:04 zhisbug