ReLLIE
Reward did not converge
Hello, author. As shown in the figure, when I tried to reproduce your work, the reward did not converge, and the final metrics differed substantially from those reported in the paper. Could you please let me know what might be causing this? Is the reward setting effective, and is the code in this repository fully correct?
The reward curve: