imitation
Approximate a finite horizon environment?
Hello 👋 This is a question, not a feature request - I hope that's alright.
I understand that this repo doesn't support infinite horizon episodes. The gridworld environment I want to use technically has an infinite horizon, but in practice episodes are always finite and of variable length (all positive reward states are either 'one-time only', such as a coin that can only be collected once, or they are terminal states). Will I see valid results if I set horizon to a number which will never be reached in practice? Or will these results be invalid in some way?
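To make the setup concrete, here is a rough sketch of what I mean by a horizon that is never reached: a variable-length episode padded out to a fixed length with a zero-reward absorbing state. The function name `pad_to_horizon` and the absorbing state index are hypothetical and not part of imitation; this only illustrates the idea and says nothing about whether the results would be valid.

```python
from typing import List, Tuple


def pad_to_horizon(
    states: List[int],
    rewards: List[float],
    horizon: int,
    absorbing_state: int,
) -> Tuple[List[int], List[float]]:
    """Pad one episode to exactly `horizon` transitions.

    `states` holds T + 1 state indices and `rewards` holds T rewards.
    Every padded step stays in `absorbing_state` (a hypothetical extra
    state) and earns zero reward, so the episode return is unchanged.
    """
    if len(states) != len(rewards) + 1:
        raise ValueError("expected one more state than rewards")
    if len(rewards) > horizon:
        raise ValueError("episode is longer than the chosen horizon")
    pad = horizon - len(rewards)
    return states + [absorbing_state] * pad, rewards + [0.0] * pad


# A 2-step episode (collect the coin, then terminate) padded to horizon 5.
padded_states, padded_rewards = pad_to_horizon(
    states=[0, 1, 2], rewards=[0.0, 1.0], horizon=5, absorbing_state=9
)
```

Here the padded episode has the same return as the original one, which is the sense in which I would expect the large horizon to be harmless.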
I want to compare the performance of different methods (particularly maximum causal entropy IRL, mce_irl.py), so I don't want to 'short change' any of them.