Joey Hejna
@SiyuanLee I am also trying to reproduce the results given in the paper. I ran the command from the README for AntBandits directly. After more than 1400 iterations, it still hasn't converged...
All transfers in navigation environments using the discriminator were done using the point mass. Thus, the point mass row contains the correct hyperparameters. As mentioned in the text, we use...
1. Yeah, that should be the correct one.
2. In `compose_params` there is a line that disables sampling goals for the maze. This is the difference between the Maze and...
Hi! Thanks for your interest in our work! We have not tried XQL on the MuJoCo benchmark, only on DM Control. The reward structure of these environments is very different....