PufferLib
Simplifying reinforcement learning for complex game environments
Hi! Is there an example script to train a baseline PPO agent using CleanRL on Nethack? Ty!
PufferLib customized with changes to support frame_stack=4 pokegym, updated counts_map, folder logic
Several changes, including a working counts_map, an option to report to wandb less frequently, some speed-related (debatable) changes, stats reporting moved to a different loop, config.yaml updated with the best settings, etc.
You probably don't need to dispatch to numpy every time you call `split` to calculate the number of elements in the space. This PR caches the sizes (in a less than...
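As a rough illustration of the caching idea described in this PR summary, here is a minimal sketch. It is not PufferLib's actual code: `FlatSplitter`, its `shapes` argument, and its attributes are hypothetical names, and it uses only the Python standard library. The point is that per-subspace element counts are computed once at construction, so `split` only slices.

```python
import math
from itertools import accumulate


class FlatSplitter:
    """Hypothetical helper: splits a flat sequence into per-subspace chunks."""

    def __init__(self, shapes):
        # Compute each subspace's element count once, up front,
        # instead of recomputing it on every call to split().
        self.sizes = tuple(math.prod(shape) for shape in shapes)
        # Precompute chunk boundaries from the cached sizes.
        self.offsets = (0, *accumulate(self.sizes))

    def split(self, flat):
        # Only slicing happens here; no size computation per call.
        if len(flat) != self.offsets[-1]:
            raise ValueError("flat input does not match total size")
        return [flat[a:b] for a, b in zip(self.offsets, self.offsets[1:])]


# Example shapes are assumptions chosen for illustration.
splitter = FlatSplitter([(2, 3), (4,), ()])
chunks = splitter.split(list(range(11)))
```

A scalar subspace with shape `()` counts as one element here, because `math.prod` of an empty tuple is 1.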
Should hopefully be faster. Based on my [comparison of different categorical distribution sampling methods](https://gist.github.com/thatguy11325/4df3b4d39e9b707b5ee0e09a7489769c). [Wandb tests](https://wandb.ai/thatguy11325/pufferlib/groups/puf-0.7.0-baseline/workspace?workspace=user-thatguy11325)
Attempts to access pufferlib.environments.atari.make_env
Windows WSL: the `pip install pufferlib` error is as follows: Using cached pufferlib-0.4.0.tar.gz (92 kB) Preparing metadata (setup.py) ... error error: subprocess-exited-with-error × python setup.py egg_info did not run successfully....
pip install pufferlib Collecting pufferlib Using cached pufferlib-0.4.5.tar.gz (94 kB) Installing build dependencies ... done Getting requirements to build wheel ... done Preparing metadata (pyproject.toml) ... done Collecting gym==0.23 (from...
Found an issue with the `OpenSkillRating` which causes Wandb logging to fail in the policy ranker. Problem is here https://github.com/PufferAI/PufferLib/blob/889f172cb27819f771681c91c9b51f8f1e132a17/pufferlib/policy_ranker.py#L90 ``` Exception has occurred: TypeError (note: full exception trace is...
1. Clean unused seed value in environment
2. Add test case for reset and step
3. Change session folder to subdirectory

Feedback welcome on the test case, was not...