population-irl

(Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards

7 population-irl issues

It currently takes around 2 seconds to load the pirl module. Most of the time this isn't a big deal, but due to a combination of: 1. We restart each...

enhancement
good first issue
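A minimal sketch of how the load time mentioned in this issue could be measured. The helper name `time_import` is hypothetical and not part of the project; the default module `json` is only a stand-in so the script runs anywhere, and `pirl` can be passed on the command line instead.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Return (seconds, already_loaded) for importing `module_name`.

    If the module is already in sys.modules the import is a cache hit,
    so run this in a fresh interpreter to measure a cold import.
    """
    already_loaded = module_name in sys.modules
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return elapsed, already_loaded


if __name__ == "__main__":
    # Pass "pirl" as the first argument to time the module from the issue;
    # "json" is only a placeholder default (an assumption for illustration).
    name = sys.argv[1] if len(sys.argv) > 1 else "json"
    seconds, cached = time_import(name)
    print(f"import {name}: {seconds:.3f}s (already loaded: {cached})")
```

For a per-module breakdown of where the time goes, CPython 3.7+ also accepts `python -X importtime -c "import pirl"`, which prints the cumulative import time of each dependency to stderr.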

New requirements:
1. Make the lava not be giant columns
2. Create a parameter for different types of tiles with reward distributions (e.g. "Water" in the range "-1 to -5",...
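One way the second requirement might look in code, as a sketch only. The `TileType` class, the `sample_tile_rewards` helper, and the "Floor" tile with zero reward are all hypothetical; only the "Water" range of -1 to -5 comes from the issue text, and a uniform distribution is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TileType:
    """A tile kind whose reward is drawn uniformly from [reward_low, reward_high]."""
    name: str
    reward_low: float
    reward_high: float

    def sample_reward(self, rng):
        return rng.uniform(self.reward_low, self.reward_high)


# "Water" uses the range quoted in the issue; "Floor" is a made-up example.
TILE_TYPES = {
    "Water": TileType("Water", -5.0, -1.0),
    "Floor": TileType("Floor", 0.0, 0.0),
}


def sample_tile_rewards(grid, tile_types, seed=0):
    """Map a grid of tile names to a grid of sampled rewards."""
    rng = random.Random(seed)
    return [[tile_types[cell].sample_reward(rng) for cell in row] for row in grid]


if __name__ == "__main__":
    grid = [["Floor", "Water"], ["Water", "Floor"]]
    for row in sample_tile_rewards(grid, TILE_TYPES, seed=42):
        print(row)
```

Keeping the distribution on the tile type rather than on the grid means new tile kinds can be added by extending the dictionary, without touching the grid generator.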

Should be based on this: https://github.com/openai/gym/blob/master/gym/envs/mujoco/ant.py
I think we probably just need to change the reward function part of that code. So either just copying that, or overwriting the step...
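As an illustration of changing only the reward, here is a sketch that wraps an environment and swaps the reward returned by `step`. It is not the approach the issue settled on (the excerpt is cut off), it deliberately avoids importing gym, and it assumes the older gym convention where `step` returns `(obs, reward, done, info)`. `RewardOverride` and `DummyEnv` are hypothetical names.

```python
class RewardOverride:
    """Wrap an environment and replace the reward returned by step()."""

    def __init__(self, env, reward_fn):
        self.env = env
        self.reward_fn = reward_fn  # called as reward_fn(obs, action)

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, original_reward, done, info = self.env.step(action)
        # Keep the environment's own reward around for debugging.
        info = dict(info, original_reward=original_reward)
        return obs, self.reward_fn(obs, action), done, info


class DummyEnv:
    """Stand-in environment (an assumption) so the sketch runs without MuJoCo."""

    def __init__(self):
        self.position = 0.0

    def reset(self):
        self.position = 0.0
        return self.position

    def step(self, action):
        self.position += action
        return self.position, 1.0, self.position >= 3.0, {}


if __name__ == "__main__":
    # Example: reward moving backwards instead of forwards.
    env = RewardOverride(DummyEnv(), lambda obs, action: -action)
    env.reset()
    print(env.step(1.0))
```

A wrapper leaves the original environment code untouched, which is the main trade-off against copying the file and editing its reward computation in place.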

It's pretty slow right now; I suspect the value iteration is slow. It would be good to do some profiling to pin it down. It's possible that moving things to GPU would speed things...

enhancement
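A sketch of the kind of profiling this issue suggests, using only the standard library. The `value_iteration` function below is a generic tabular version written for illustration, not the project's implementation, and `profile_call` is a hypothetical helper; the two-state MDP is made up.

```python
import cProfile
import io
import pstats


def value_iteration(transitions, rewards, discount=0.9, tol=1e-6, max_iters=1000):
    """Tabular value iteration.

    transitions[s][a] is a list of (probability, next_state) pairs;
    rewards[s] is the reward received in state s.
    """
    values = [0.0] * len(transitions)
    for _ in range(max_iters):
        new_values = []
        for s, actions in enumerate(transitions):
            best = max(
                sum(p * values[s2] for p, s2 in outcomes) for outcomes in actions
            )
            new_values.append(rewards[s] + discount * best)
        delta = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


if __name__ == "__main__":
    # State 0 can stay put or move to state 1; state 1 is absorbing with reward 1.
    transitions = [[[(1.0, 0)], [(1.0, 1)]], [[(1.0, 1)]]]
    rewards = [0.0, 1.0]
    values, report = profile_call(value_iteration, transitions, rewards)
    print(values)
    print(report)
```

Sorting by cumulative time shows which call tree dominates, which is usually enough to confirm or rule out value iteration as the bottleneck before considering a GPU port.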