lichao
**Describe the bug** When the script automatically proceeds to step 2, it fails with the error: exits with return code = -9 Traceback (most recent call last): File "/home/kidd/projects/llms/DeepSpeed/DeepSpeedExamples/applications/DeepSpeed-Chat/train.py", line 210, in <module> main(args) File "/home/kidd/projects/llms/DeepSpeed/DeepSpeedExamples/applications/DeepSpeed-Chat/train.py",...
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM: size mismatch for base_model.model.transformer.layers.0.attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]). size mismatch...
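The shapes in this error point to a LoRA rank mismatch: a `lora_A` weight has shape `(r, in_features)`, so the checkpoint appears to have been saved with `r=16` while the current model was built with `r=8`. Below is a minimal sketch of how such mismatches could be listed before calling `load_state_dict`. It uses plain shape tuples rather than real tensors, and the helper name `find_lora_rank_mismatches` is hypothetical, not part of PEFT or DeepSpeed.

```python
def find_lora_rank_mismatches(checkpoint_shapes, model_shapes):
    """Return {param_name: (checkpoint_rank, model_rank)} for differing lora_A ranks.

    Both arguments map parameter names to shape tuples. For a lora_A weight
    the first dimension is the LoRA rank r, the second is in_features.
    """
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        if "lora_A" not in name or name not in model_shapes:
            continue
        ckpt_rank = ckpt_shape[0]
        model_rank = model_shapes[name][0]
        if ckpt_rank != model_rank:
            mismatches[name] = (ckpt_rank, model_rank)
    return mismatches


# Parameter name and shapes taken from the error message above.
key = "base_model.model.transformer.layers.0.attention.query_key_value.lora_A.default.weight"
checkpoint_shapes = {key: (16, 4096)}
model_shapes = {key: (8, 4096)}

result = find_lora_rank_mismatches(checkpoint_shapes, model_shapes)
for name, (ckpt_rank, model_rank) in result.items():
    print(f"{name}: checkpoint r={ckpt_rank}, model r={model_rank}")
```

If this diagnosis holds, building the model with the same rank that was used when the checkpoint was saved should make the shapes agree.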
### 🐛 Describe the bug Tried to run train_sft.sh and got an OOM error: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 172.00 MiB (GPU 0; 23.68 GiB total capacity; 18.08 GiB...
**Describe the bug** Running step 2 with the script: deepspeed DeepSpeedExamples/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/main.py \ --data_split 2,4,4 \ --model_name_or_path facebook/opt-350m \ --num_padding_at_beginning 1 \ --per_device_train_batch_size 8 \ --per_device_eval_batch_size 8 \ --max_seq_len 512 \ --learning_rate 5e-5...
Running step 3 with: deepspeed --master_port 12346 DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/main.py \ --data_path wangrui6/Zhihu-KOL \ --data_split 2,4,4 \ --actor_model_name_or_path /home/kidd/projects/llms/pretrain_models/ChatGLM-6B/ \ --critic_model_name_or_path /home/kidd/projects/llms/path_to_rm_checkpoint/ \ --num_padding_at_beginning 1 \ --per_device_train_batch_size 4 \ --per_device_mini_train_batch_size 4 \ --generation_batch_numbers...
The chat example seems to be based on Facebook models. If I follow the default steps to build my own models, is commercial use allowed? What if I want to train...
Fixed the image path problem in the docker.md file.
from baiduspider import BaiduSpider from pprint import pprint import sys # Instantiate BaiduSpider spider = BaiduSpider() or_qe="香港2023年5月14日发生一起抢劫案,劫匪有几人?" baidu_re=spider.search_web(input(or_qe)) pprint(baidu_re) pprint(spider.search_web(query='Python')) ------------ The pprint output is: If plain is added, the result is an empty list [].