rl-collision-avoidance
Why is my trained policy not as good as yours?
Hi, I followed all your steps and trained the policy from scratch for stage 1.
I am not able to get a policy as good as yours (it still crashes every time), even after training for 12 hours.
May I ask whether you did anything special to train the policy? I have tried many times but cannot get a good policy, and training from scratch seems to give very poor results.
It's hard to say. You could try training for longer and see whether the performance improves. For reference, I used three machines for distributed training.