weiddeng
Hi ashkan-leo, can you share your command? I'd like to see our diffs. An out-of-memory error is better than the one I got. Thanks!
The mistake I made was using `--LlamaDecoderLayer 'LLaMADecoderLayer'`; it should be `--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer'`. However, I still got an OutOfMemoryError. With `--nproc_per_node=1`: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 774.00 MiB (GPU...
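For anyone hitting the same flag typo: a sketch of what the corrected launch might look like (script name, paths, and the other flags here are placeholders from my own setup, not the exact command above):

```shell
# Hypothetical example: the key point is the corrected FSDP wrap flag,
# which must name the actual class in transformers ('LlamaDecoderLayer'),
# not the misspelled 'LLaMADecoderLayer'.
torchrun --nproc_per_node=1 train.py \
  --model_name_or_path ./llama-13b \
  --fsdp "full_shard auto_wrap" \
  --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer'
```

Note the class name is case-sensitive; FSDP auto-wrap silently fails to match (or errors) if it doesn't correspond to a module class defined in the model.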
Hi @felri in your instruction fine tune data, I see each instruction is prefixed by "In Stardew Valley, " What is the rationale? Have you tried without the prefix? Thanks!
"Since the special_token_map.json is auto-generated" - interesting, it was not generated in my case. But you are right on that I used this leaked version of llama-13b :sweat_smile:
I'm experiencing the same issue.
`gradient_accumulation_steps = batch_size // micro_batch_size`
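To spell out what this line does: it keeps the effective batch size fixed while only ever holding a small micro-batch in GPU memory at once. A minimal sketch with hypothetical values (128 and 4 are illustrative, not from the script):

```python
# Hypothetical values illustrating gradient accumulation:
batch_size = 128          # desired effective batch size seen by the optimizer
micro_batch_size = 4      # what actually fits on the GPU per forward/backward pass

# Number of micro-batch gradients summed before each optimizer step.
gradient_accumulation_steps = batch_size // micro_batch_size

# When batch_size is divisible by micro_batch_size, the effective
# batch size is recovered exactly.
effective_batch = micro_batch_size * gradient_accumulation_steps
print(gradient_accumulation_steps, effective_batch)  # 32 128
```

So lowering `micro_batch_size` to dodge CUDA OOM errors does not change the training math, only how the batch is split across steps.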