IverYangg comments

Results: 3 comments of IverYangg

About training: use GPU or CPU?

I wonder whether this flag has any effect. I tried changing it to True, but I can't see any change in speed. The line in question: https://github.com/nicrusso7/rex-gym/blob/82dea26bcd8896da06240bcc3abd4de5b4696430/rex_gym/agents/scripts/configs.py#L27

ValueError: device value error, must be str,

> > Excuse me, I only ran train.py in the simulation environment, following the README, but it raised an exception: ValueError: device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device. I tested it: device=cuda, type(device) = , and I still can't find where the problem is. Could you help take a look? > >...

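Question about computing GAE in the PPO algorithm

I have one more question. An episode can end in three ways. First, the task fails before the episode's maximum number of steps is reached, so it cannot continue and the environment is reset. Second, the task is completed successfully, also before the maximum number of steps is reached. Third, in an environment such as MuJoCo's Hopper, the task has no notion of ending in success; the episode ends only because the step count reaches the maximum. In that case the last step still returns a new observation and a value estimate for that new state. Should the advantage computation in this third case be handled differently from the first two?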
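To make the three cases concrete, here is a minimal sketch of one common convention, not necessarily what the repository in question does. It assumes the rollout records two separate flags per step (`terminated` for a real end by failure or success, `truncated` for a time-limit cutoff, as Gymnasium-style environments report them) and a `bootstrap_values` list holding the critic's estimate for the observation returned by each step. The function name and all parameter names are hypothetical. Under this convention the first two cases use a next-state value of 0, while the time-limit case bootstraps from the value of the final observation.

```python
def compute_gae(rewards, values, terminated, truncated, bootstrap_values,
                gamma=0.99, lam=0.95):
    """Sketch of GAE that separates real termination from time-limit truncation.

    rewards[t]          reward received at step t
    values[t]           critic estimate V(s_t)
    terminated[t]       True if the episode really ended at step t (failure or success)
    truncated[t]        True if the episode was cut off at step t by the step limit
    bootstrap_values[t] critic estimate for the observation returned by step t
                        (taken before any reset); ignored when terminated[t] is True
    """
    n = len(rewards)
    advantages = [0.0] * n
    gae = 0.0
    for t in reversed(range(n)):
        # A real terminal state has no future return, so its value is 0.
        # A truncated episode could have continued, so keep V(s_{t+1}).
        next_value = 0.0 if terminated[t] else bootstrap_values[t]
        delta = rewards[t] + gamma * next_value - values[t]
        if terminated[t] or truncated[t]:
            # Episode boundary: do not carry the advantage of the next episode back.
            gae = delta
        else:
            gae = delta + gamma * lam * gae
        advantages[t] = gae
    return advantages


# Same two-step rollout, ending once by termination and once by truncation.
rewards = [1.0, 1.0]
values = [0.5, 0.5]
bootstrap_values = [0.5, 2.0]
adv_terminated = compute_gae(rewards, values, [False, True], [False, False],
                             bootstrap_values, gamma=1.0, lam=1.0)
adv_truncated = compute_gae(rewards, values, [False, False], [False, True],
                            bootstrap_values, gamma=1.0, lam=1.0)
```

With these toy numbers the truncated rollout yields larger advantages than the terminated one, because the value of the final observation is included in the target instead of being zeroed out.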