off-policy
off-policy copied to clipboard
Some questions about the code
why did the code require only one env when using rnn policy? https://github.com/marlbenchmark/off-policy/blob/release/offpolicy/scripts/train/train_mpe.py#L154
我也想问一样的问题