Rémy Portelas

Results: 2 issues by Rémy Portelas

I am currently experimenting with scaling EfficientZero to high-data regimes. As a first step, I am running experiments on Atari, with a replay buffer of 1M environment...

I might have found an unexpected behavior in how parallel training environments are being seeded. I am referring to this line: https://github.com/YeWR/EfficientZero/blob/c533ebf5481be624d896c19f499ed4b2f7d7440d/core/selfplay_worker.py#L112 Because the rank of the first selfplay worker...
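For illustration only, here is a minimal sketch of one way to give every parallel environment its own seed across selfplay workers. It does not reproduce the expression at the linked line. The function name `make_env_seeds` and the parameters `base_seed`, `rank`, and `envs_per_worker` are hypothetical and are not taken from the EfficientZero code base. The idea is to offset the base seed by a per-worker block, so that the worker with rank 0 still gets distinct seeds for each of its environments.

```python
from typing import List


def make_env_seeds(base_seed: int, rank: int, envs_per_worker: int) -> List[int]:
    """Return one distinct seed per environment for the worker with this rank.

    Hypothetical helper: each worker owns a contiguous block of
    `envs_per_worker` seeds starting at base_seed + rank * envs_per_worker.
    """
    if rank < 0 or envs_per_worker <= 0:
        raise ValueError("rank must be >= 0 and envs_per_worker must be > 0")
    offset = rank * envs_per_worker
    return [base_seed + offset + env_idx for env_idx in range(envs_per_worker)]


# Collect the seeds of 3 workers with 4 environments each and check
# that no seed is shared, including within the rank-0 worker.
all_seeds = [s for rank in range(3) for s in make_env_seeds(0, rank, 4)]
assert len(set(all_seeds)) == len(all_seeds)
```

An additive offset is used here because it stays unique when the rank is 0, which is the case the issue above is concerned with. Whether this matches the intended seeding scheme of the repository is an open question.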