Notebook '04' shows no training effect
Hi Leandro,
First of all, many thanks for the amazing work on the library. I've found your documentation very easy to get into, especially paired with your talk at the Reinforcement Learning Meetup in Zurich.
I was experimenting with your notebooks and noticed that notebook '04' did not show any training effect for me. Please see the W&B outputs HERE for reference. They are the result of simply executing the notebook with the provided training parameters; the only change is a fix for a KeyError that I described in Git Issue #37. I also tried different learning rates, batch sizes, etc., but could not clearly identify a learning effect.
Could you provide guidance on how I can address this? Could my fix for Git Issue #37 be incorrect and the cause of the problem?
Thanks in advance and best, Philip