Results: 6 comments by J Seppänen

@maitchison OK, I addressed all of your comments, plus added one extra change (using the builtin `gym.wrappers.NormalizeObservation` instead of a DIY version) – I'm still going to re-run some of the experiments to...

OK, so some news: I reverted the hyperparameter change (commit 0fe7b1f) and the learning stage ordering change (commit e50bf9e). The reason is that I did some investigation into what changes...

Update: I re-ran the experiments with commit 94fc331, updated the documentation/results, and removed the main() function. @vwxyzjn GitHub is telling me that "requested changes must be addressed" but I...

> One important difference seems that PPO-DNA consistently uses 128 parallel envs and PPO by default is using 8. I'm not aware if using higher makes too much of a...

@vwxyzjn sure, I added benchmarks/ppo_dna.sh. Also, maybe I haven't communicated my experiment results clearly enough; I could also try to consolidate them into one place.

I can live with the 2GB limit as such, but the problem is that I want to make sure my data doesn't get silently truncated if I at some point...
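A hypothetical helper (not from the PR, and the exact storage backend behind the 2GB limit is not stated above) illustrating the guard against silent truncation: fail loudly if a file reaches the limit instead of letting an upload clip it quietly.

```python
import os

TWO_GB = 2 * 1024**3  # the limit discussed above, in bytes

def assert_under_limit(path: str) -> int:
    """Return the file size in bytes, raising if it hits the 2GB limit."""
    size = os.path.getsize(path)
    if size >= TWO_GB:
        raise RuntimeError(f"{path} is {size} bytes, at or over the 2GB limit")
    return size
```

Running such a check before saving or uploading turns a silent truncation into an explicit error.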