PufferLib

Simplifying reinforcement learning for complex game environments

Results 111 PufferLib issues

Hi! Is there an example script to train a baseline PPO agent using CleanRL on Nethack? Ty!

Several changes, including a working counts_map, an option to report to wandb less frequently, some (debatable) speed-related changes, stats reporting moved to a different loop, config.yaml updated with the best settings, etc.

You probably don't need to dispatch to numpy every time you call `split` to calculate the number of elements in the space. This PR caches the sizes (in a less than...
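To illustrate the caching idea, here is a minimal sketch, not the PR's actual code. The names `flat_sizes` and `split` and the tuple-of-shapes input are assumptions made for this example. It computes the element count of each sub-space once and reuses the result on later calls.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def flat_sizes(shapes):
    """Element count per sub-space, computed once per distinct `shapes` tuple.

    `shapes` is a hypothetical stand-in for a structured space:
    a tuple of shape tuples, e.g. ((2, 3), (4,), ()).
    """
    return tuple(math.prod(shape) for shape in shapes)


def split(flat, shapes):
    """Split a flat sequence into one chunk per sub-space."""
    chunks, start = [], 0
    for size in flat_sizes(shapes):  # cached after the first call
        chunks.append(flat[start:start + size])
        start += size
    return chunks


shapes = ((2, 3), (4,), ())  # a scalar sub-space counts as one element
parts = split(list(range(11)), shapes)
```

Because the shapes are hashable tuples, `lru_cache` can key on them directly, so repeated `split` calls on the same space skip the size computation.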

Should hopefully be faster. Based on my [comparison of different categorical distribution sampling methods](https://gist.github.com/thatguy11325/4df3b4d39e9b707b5ee0e09a7489769c). [Wandb tests](https://wandb.ai/thatguy11325/pufferlib/groups/puf-0.7.0-baseline/workspace?workspace=user-thatguy11325)

Attempts to access pufferlib.environments.atari.make_env

On Windows WSL, `pip install pufferlib` fails with the following error: Using cached pufferlib-0.4.0.tar.gz (92 kB) Preparing metadata (setup.py) ... error error: subprocess-exited-with-error × python setup.py egg_info did not run successfully....

pip install pufferlib Collecting pufferlib Using cached pufferlib-0.4.5.tar.gz (94 kB) Installing build dependencies ... done Getting requirements to build wheel ... done Preparing metadata (pyproject.toml) ... done Collecting gym==0.23 (from...

Found an issue with the `OpenSkillRating` which causes Wandb logging to fail in the policy ranker. Problem is here https://github.com/PufferAI/PufferLib/blob/889f172cb27819f771681c91c9b51f8f1e132a17/pufferlib/policy_ranker.py#L90 ``` Exception has occurred: TypeError (note: full exception trace is...

1. Clean unused seed value in environment
2. Add test case for reset and step
3. Change session folder to sub directory
Feedback welcome on the test case, was not...