Haixia
It hangs at this point and never proceeds further. I suspect it is an optimizer problem.
The same puzzle!! When I set "lora_rank" to 0, I can run successfully, but the saved model file is very big, about 13 GB!!!
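For what it's worth, ~13 GB is roughly what a full (non-LoRA) save of a 7B-parameter model in fp16 comes to, so the file size itself is expected. A quick back-of-the-envelope check; the parameter count and dtype below are assumptions, not values reported in this thread:

```python
# Rough checkpoint-size estimate for a full (lora_rank=0) save.
# Assumption: the base model is LLaMA-7B (~6.7e9 parameters) stored in fp16 (2 bytes/param).
num_params = 6.7e9
bytes_per_param = 2  # fp16
size_gb = num_params * bytes_per_param / 1024**3
print(f"expected full checkpoint size ≈ {size_gb:.1f} GB")  # ≈ 12.5 GB
```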
> When you set lora_rank to 0, you are training the model without lora training, which is described here:

Yes, I know that setting "lora_rank" to 0 means...
> > When you set lora_rank to 0, you are training the model without lora training, which is described here:
> >
> > yes, i know that when i set "lora_rank"...
> @Camille7777 Hello, I found that the code you submitted did not solve the problem of saving model parameters when using LoRA training. I found that after the training, the...
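One common way to make a LoRA run produce a standalone, loadable checkpoint is to fold the low-rank update back into the base weights before saving. A minimal sketch follows; the attribute names (`lora_A`, `lora_B`, `scaling`) are assumptions about a typical LoRA linear layer, not the exact names used in this repo:

```python
import torch

def merge_lora_linear(layer):
    """Fold the LoRA update into the base weight: W <- W + scaling * (B @ A)."""
    with torch.no_grad():
        delta = layer.lora_B @ layer.lora_A          # (out_features, in_features)
        layer.weight.data.add_(layer.scaling * delta)

def merge_and_save(model, path):
    """Merge every LoRA-augmented linear layer, then save a plain state_dict."""
    for module in model.modules():
        if hasattr(module, "lora_A") and hasattr(module, "lora_B"):
            merge_lora_linear(module)
    torch.save(model.state_dict(), path)
```

After merging, the saved state_dict can be loaded like an ordinary full checkpoint, at the cost of losing the small-adapter-only file size.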
Me too!! Why? When I run the code on 7 or 8 GPUs, it hits the same error as you, but when I run on 6 GPUs it succeeds! I am...
Met the same issue:

AttributeError: 'LlamaRM' object has no attribute 'resize_token_embeddings'
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 526441 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 526440) of binary: /data/anaconda3/bin/python
Traceback (most recent...
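The traceback suggests `resize_token_embeddings` is being called on the reward-model wrapper (`LlamaRM`) rather than on the Hugging Face model it wraps. A possible workaround sketch; the `.model` attribute name is a guess about the wrapper's internals, not something confirmed in this thread:

```python
# Workaround sketch, not an official fix.
# Assumption: LlamaRM stores the underlying Hugging Face Llama model as `model.model`.
if hasattr(model, "resize_token_embeddings"):
    model.resize_token_embeddings(len(tokenizer))
else:
    # Resize the wrapped HF model's embeddings instead of the wrapper.
    model.model.resize_token_embeddings(len(tokenizer))
```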
> I met this error too, but if you are training stage 2, you should change the pretrain to Coati7B (the model you trained in stage 1) instead of the LLaMA7B that is provided...
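To make the quoted advice concrete: stage 2 should start from the checkpoint produced by stage 1 rather than from the original LLaMA-7B weights. A minimal loading sketch, assuming the stage-1 script wrote a Hugging Face-format directory (the path below is a placeholder):

```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Placeholder path; point `pretrain` at your stage-1 (Coati7B) output, not raw LLaMA-7B.
pretrain = "path/to/coati-7b-stage1-output"
model = LlamaForCausalLM.from_pretrained(pretrain)
tokenizer = AutoTokenizer.from_pretrained(pretrain)
```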
Have you solved this problem? I'm also curious how to construct the prompt!