Why is my trained policy not as good as yours?

Open • gwhan98 opened this issue 5 years ago • 1 comment

Hi, I followed all your steps and trained the policy from scratch for stage 1.

I have not been able to get a policy as good as yours (the agent still always crashes), even after training for 12 hours.

May I ask if you used anything special to train the policy? I have tried many times but cannot get a good policy, and training from scratch seems to perform very poorly.

Jan 27 '21 19:01 gwhan98

It's hard to say. You could try training for longer and see how the performance changes. For reference, I used three machines for distributed training.

Jan 28 '21 02:01 Acmece