BigLoong
Could I ask why, in the loss function on line 44 of train_retriever.py, the cross_entropy training target is torch.arange(0, len(l_pos))?
Error Info:
  File "/data/rooter_use/conda/envs/llama-env39/lib/python3.9/site-packages/deepspeed/runtime/hybrid_engine.py", line 398, in step
    actor_loss, critic_loss = trainer.train_rlhf(exp_data)
  File "/data/rooter_use/DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py", line 173, in train_rlhf
    actor_loss, critic_loss = trainer.train_rlhf(exp_data) if(self._inference_containers[0].module.attention.attn_qkvw is not None and \
  File "/data/rooter_use/DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py",...