IQL results differ from the paper
Hi,
The IQL results from this repo seem to differ from the original paper.
According to the README of the IQL example code here, IQL reaches an average raw return of about 1500 on hopper-medium-expert with offline training:
https://github.com/rail-berkeley/rlkit/tree/master/examples/iql

However, the original paper notes that IQL scores 91.5 in normalized average return (which is about 2950 in raw return):
https://arxiv.org/pdf/2110.06169.pdf
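For reference, the conversion between normalized and raw scores follows D4RL's convention, `normalized = 100 * (raw - random) / (expert - random)`. A minimal sketch, assuming the standard D4RL reference scores for hopper (random ≈ -20.27, expert ≈ 3234.3):

```python
# Assumed D4RL reference returns for the hopper environment.
HOPPER_RANDOM = -20.272305
HOPPER_EXPERT = 3234.3

def normalized_to_raw(normalized, random=HOPPER_RANDOM, expert=HOPPER_EXPERT):
    """Invert D4RL's normalization: normalized = 100 * (raw - random) / (expert - random)."""
    return random + (normalized / 100.0) * (expert - random)

def raw_to_normalized(raw, random=HOPPER_RANDOM, expert=HOPPER_EXPERT):
    """Map a raw return to the D4RL normalized score."""
    return 100.0 * (raw - random) / (expert - random)
```

Under these reference scores, a normalized score of 91.5 corresponds to a raw return of roughly 2958, and a raw return of 1500 corresponds to a normalized score of roughly 46.7.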

Can you take a look at this and check what is causing the difference?
Thank you!
Sorry for the late response, but the IQL experiments in the paper were run in JAX and should be reproducible with the other repo: https://github.com/ikostrikov/implicit_q_learning
This PyTorch reimplementation is provided for convenience and likely has minor differences in initialization, etc.