IQL-PyTorch

A PyTorch implementation of Implicit Q-Learning

Results 4 IQL-PyTorch issues

In the function return_range, the end of a trajectory is marked either by the terminal signal or by the number of time steps reaching max_episode_steps. However, as the dataset is extracted from D4RL's...
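For readers unfamiliar with the rule being discussed, here is a minimal sketch of that kind of episode splitting. It is not the repository's actual code: the name `episode_return_range` and the flat `rewards`/`terminals` lists are assumptions made for illustration, and an incomplete trailing episode is simply ignored.

```python
def episode_return_range(rewards, terminals, max_episode_steps):
    """Return (min, max) of per-episode returns over a flat list of transitions.

    Hypothetical helper for illustration only; not taken from the repository.
    """
    returns = []
    ep_ret, ep_len = 0.0, 0
    for r, done in zip(rewards, terminals):
        ep_ret += float(r)
        ep_len += 1
        # An episode ends on a terminal flag or when the step cap is reached.
        if done or ep_len == max_episode_steps:
            returns.append(ep_ret)
            ep_ret, ep_len = 0.0, 0
    # Transitions after the last boundary form an incomplete episode
    # and are not counted in this sketch.
    if not returns:
        raise ValueError("no complete episode found")
    return min(returns), max(returns)


# Toy data: six transitions, one terminal flag on the third.
rewards = [1.0, 1.0, 1.0, 2.0, 2.0, 5.0]
terminals = [False, False, True, False, False, False]
print(episode_return_range(rewards, terminals, max_episode_steps=2))
```

Note that the step counter only resets at a boundary it detects itself, so if the data contains episode breaks that are flagged neither as terminal nor by the step cap, the split can drift out of alignment with the true episodes.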

![image](https://user-images.githubusercontent.com/8472226/177190425-1f871edb-0456-41b0-8522-b5d6b45b1cf6.png) I was able to get results close to or better than the official ones (I also made the cosine-damped learning rate work in 5000 steps).

Thanks for your work. Does your code perform well in Antmaze environments?