IQL results differ from the paper
Hi,
The IQL results from this repo seem to differ from the original paper.
According to the README of the IQL example code here, IQL reaches an average raw return of about 1500 on hopper-medium-expert with offline training:
https://github.com/rail-berkeley/rlkit/tree/master/examples/iql

However, the original paper notes that IQL scores 91.5 in normalized average return (which is about 2950 in raw return):
https://arxiv.org/pdf/2110.06169.pdf
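For reference, the conversion between normalized and raw scores follows D4RL's convention, `normalized = 100 * (raw - random) / (expert - random)`. A minimal sketch, assuming the standard D4RL reference scores for hopper (random ≈ -20.27, expert ≈ 3234.3):

```python
# Assumed D4RL reference returns for the hopper environment.
HOPPER_RANDOM = -20.272305
HOPPER_EXPERT = 3234.3

def normalized_to_raw(normalized, random=HOPPER_RANDOM, expert=HOPPER_EXPERT):
    """Invert D4RL's normalization: normalized = 100 * (raw - random) / (expert - random)."""
    return random + (normalized / 100.0) * (expert - random)

def raw_to_normalized(raw, random=HOPPER_RANDOM, expert=HOPPER_EXPERT):
    """Map a raw return to the D4RL normalized score."""
    return 100.0 * (raw - random) / (expert - random)
```

Under these reference scores, a normalized score of 91.5 corresponds to a raw return of roughly 2958, and a raw return of 1500 corresponds to a normalized score of roughly 46.7.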

Can you take a look at this and check what is causing the difference?
Thank you!
Sorry for the late response, but the IQL experiments in the paper were run in JAX and should be reproducible with the other repo: https://github.com/ikostrikov/implicit_q_learning
This PyTorch reimplementation is provided for convenience and likely has minor differences in initialization, etc.