蒲源
Add EfficientZero policy and model, plus the related env and config, and migrate the existing AlphaZero demo.
Hi, first of all, thank you for open-sourcing your nice code! I have a question regarding the effect of `torch_amp`: I tested the training process of EfficientZero when using and...
Add `output_activation`, `output_norm_type`, and `last_linear_layer_init_zero` options for MLP.
Add modified gym-hybrid, including the moving, sliding and hardmove envs.
Thank you very much for your open-sourced code. I am confused about this code segment in the [backpropagate](https://github.com/werner-duvaud/muzero-general/blob/master/self_play.py#L406) method in self_play.py: when `len(self.config.players)` is 2, - in line [423](https://github.com/werner-duvaud/muzero-general/blob/master/self_play.py#L423): `min_max_stats.update(node.reward +...
Thank you very much for your open-sourced code. This is a common definition of a target value in classical RL: I'm a little confused about the way of calculating target value...
Thank you very much for your open-sourced code. I'm a little confused about the reason for the identity connection of state encoding in [DynamicsNetwork](https://github.com/YeWR/EfficientZero/blob/main/config/atari/model.py#L252) in model.py: Why do we add this...
Thank you very much for your open-sourced code. I am confused about this code segment in the [put_last_trajectory](https://github.com/YeWR/EfficientZero/blob/main/core/selfplay_worker.py#L69) method in selfplay_worker.py: In [Line 69](https://github.com/YeWR/EfficientZero/blob/main/core/selfplay_worker.py#L69), why is ` pad_child_visits_lst = game_histories[i].child_visits[beg_index:end_index]`...
- Our work is currently focused on developing a unified and scalable planning framework. - Our code is partially based on https://github.com/eloialonso/iris.
- add go_env, related unittest - add go mcts bot and alphazero/muzero config - add league version of alphazero - add ctree version of alphazero