Zhanhui Zhou
## Description

[Implicit Behavior Cloning](https://arxiv.org/pdf/2109.00137.pdf)

[Experiment results](https://vsde0sjona.feishu.cn/wiki/wikcnG5I5iux7PA2Jr44eDAfoZf)

## Related Issue

## TODO

## Check List

- [x] minimal working pipeline
- [x] vanilla energy based model
- [x] autoregressive energy...
## Description

1. modify IMPALA to handle continuous action space
2. create QIMPALA which combines IMPALA and SAC

## Related Issue

## TODO

## Check List

- [ ] merge...
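1. Change how we transform a distribution. For example, https://github.com/opendilab/DI-engine/blob/main/ding/policy/sac.py#L816-L822 can be changed to

   ```
   from torch.distributions import Independent, Normal, TransformedDistribution
   from torch.distributions.transforms import TanhTransform
   dist = TransformedDistribution(Independent(Normal(mu, sigma), 1), [TanhTransform()])
   next_action = dist.rsample()
   # log_prob needs the sampled value as its argument
   next_log_prob = dist.log_prob(next_action)
   ```

   This is...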
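For reference, here is a minimal pure-Python sketch of the quantity that the `TransformedDistribution` suggestion above evaluates: the log-density of a tanh-squashed diagonal Gaussian, obtained by the change-of-variables formula. It is not DI-engine code. The function names (`normal_log_prob`, `tanh_log_abs_det_jacobian`, `squashed_gaussian_log_prob`) are hypothetical, and the inputs are plain lists of floats rather than tensors.

```python
import math


def normal_log_prob(u, mu, sigma):
    """Log-density of a 1-D Gaussian N(mu, sigma^2) evaluated at u."""
    return (
        -0.5 * ((u - mu) / sigma) ** 2
        - math.log(sigma)
        - 0.5 * math.log(2.0 * math.pi)
    )


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def tanh_log_abs_det_jacobian(u):
    """log|d tanh(u)/du| = log(1 - tanh(u)^2), written in a stable form."""
    return 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))


def squashed_gaussian_log_prob(u, mu, sigma):
    """Log-density of a = tanh(u), where u ~ N(mu, diag(sigma^2)).

    u is the pre-tanh sample. The per-dimension terms are summed, which
    mirrors treating the last dimension as a single event.
    """
    total = 0.0
    for u_i, mu_i, sigma_i in zip(u, mu, sigma):
        total += normal_log_prob(u_i, mu_i, sigma_i)
        total -= tanh_log_abs_det_jacobian(u_i)
    return total


if __name__ == "__main__":
    u = [0.3, -1.2]  # hypothetical pre-tanh sample
    action = [math.tanh(x) for x in u]
    print(action, squashed_gaussian_log_prob(u, [0.0, 0.0], [1.0, 1.0]))
```

The stable Jacobian form avoids computing `log(1 - tanh(u)**2)` directly, which loses precision when `|u|` is large and the action saturates near ±1.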
Thanks for your great contribution to this reimplementation. Is there any plan to test this implementation on MineRL? Also, is there any plan to release the results of *crafter* and...