IQL-PyTorch
A PyTorch implementation of Implicit Q-Learning
In the function return_range, the end of a trajectory is marked either by the terminal signal or by the number of time steps reaching max_episode_steps. However, as the dataset is extracted from D4RL's...
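The issue above refers to the repo's return_range helper. A minimal sketch of the episode-splitting logic being discussed (the function name comes from the issue; the dict keys "rewards"/"terminals" and the exact return value are assumptions about the D4RL-style dataset layout, not the repo's actual code) might look like:

```python
def return_range(dataset, max_episode_steps):
    """Compute min/max undiscounted episode returns in a D4RL-style dataset.

    An episode ends either when the terminal flag is set or when its
    length reaches max_episode_steps (a timeout) -- the two end-of-
    trajectory conditions the issue describes.
    """
    returns, lengths = [], []
    ep_ret, ep_len = 0.0, 0
    for r, d in zip(dataset["rewards"], dataset["terminals"]):
        ep_ret += float(r)
        ep_len += 1
        if d or ep_len == max_episode_steps:
            returns.append(ep_ret)
            lengths.append(ep_len)
            ep_ret, ep_len = 0.0, 0
    lengths.append(ep_len)  # count any partial trailing episode
    # every transition must be assigned to exactly one episode
    assert sum(lengths) == len(dataset["rewards"])
    return min(returns), max(returns)
```

The issue's point is that a timeout is not a true terminal state, so treating the two identically can mislabel episode boundaries when the dataset's step counter does not align with max_episode_steps.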
I was able to get results close to, or better than, the official ones (I also got the cosine-decayed learning rate schedule to work in 5000 steps).
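The comment above mentions a cosine-decayed learning rate over 5000 steps. One common closed form for such a schedule (the function name and the base_lr/min_lr values are illustrative placeholders, not the commenter's actual settings) is:

```python
import math

def cosine_lr(step, total_steps=5000, base_lr=3e-4, min_lr=0.0):
    """Cosine-annealed learning rate: starts at base_lr,
    decays smoothly to min_lr by total_steps."""
    frac = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * frac))
```

In PyTorch the same shape is available out of the box via torch.optim.lr_scheduler.CosineAnnealingLR with T_max set to the total step count.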
Thanks for your work. Does your code also perform well on the AntMaze environments?