Approximate a finite horizon environment?

Open kierad opened this issue 2 years ago • 0 comments

Hello 👋 This is a question, not a feature request - I hope that's alright.

I understand that this repo doesn't support infinite-horizon episodes. The gridworld environment I want to use technically has an infinite horizon, but in practice episodes are always finite and of variable length (all positive-reward states are either 'one-time only', such as a coin that can only be collected once, or they are terminal states). Will I see valid results if I set the horizon to a number that will never be reached in practice? Or will these results be invalid in some way?
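To make the setup concrete, here is a minimal sketch of what I mean by "a horizon that is never reached". It is plain Python and does not use this repo's API. The names `pad_to_horizon` and `absorbing_state` are hypothetical, and I am assuming that a terminated episode can be treated as sitting in a zero-reward absorbing state until the fixed horizon runs out.

```python
from typing import List, Tuple


def pad_to_horizon(
    states: List[int],
    rewards: List[float],
    horizon: int,
    absorbing_state: int,
) -> Tuple[List[int], List[float]]:
    """Pad a variable-length episode to a fixed horizon.

    Assumption (mine, not the repo's): after termination the agent stays in
    `absorbing_state` and collects zero reward on every remaining step.
    """
    # An episode with T transitions visits T + 1 states.
    if len(states) != len(rewards) + 1:
        raise ValueError("expected len(states) == len(rewards) + 1")
    if len(rewards) > horizon:
        raise ValueError("episode is longer than the chosen horizon")

    n_pad = horizon - len(rewards)
    padded_states = states + [absorbing_state] * n_pad
    padded_rewards = rewards + [0.0] * n_pad
    return padded_states, padded_rewards


# A 2-step episode padded to horizon 5; state 9 is the hypothetical absorbing state.
padded_states, padded_rewards = pad_to_horizon([0, 1, 2], [0.0, 1.0], 5, 9)
```

Under that assumption the padding leaves the undiscounted return unchanged, so my question is whether the methods here treat such padded episodes the same as the original finite ones.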

I want to compare the performance of different methods (particularly max causal entropy, mce_irl.py), so I don't want to 'short-change' any of them.

Jul 11 '23 16:07 kierad