garage
Rework logic for filling and checking the replay buffer in torch SAC, DDPG, and TD3
Currently in SAC, train_once returns None if the replay buffer holds fewer than the minimum number of timesteps.
The function should still return a well-defined value or raise an exception.
_train_once is either private, or should be private, so I'm not sure what purpose this exception would have.
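One way to reconcile the two points above might look like the sketch below. It is only an illustration, not garage's actual code: ToyReplayBuffer, ToyAlgo, buffer_ready, and _update are hypothetical names, and the "loss" is a placeholder mean of rewards. The public train_once always returns a list (empty while the buffer is still filling), and only the private update step raises, as a guard against internal misuse.

```python
import random
from collections import deque


class ToyReplayBuffer:
    """Hypothetical stand-in for a replay buffer (not garage's class)."""

    def __init__(self, capacity):
        self._storage = deque(maxlen=capacity)

    def add(self, transition):
        self._storage.append(transition)

    @property
    def n_transitions_stored(self):
        return len(self._storage)

    def sample(self, batch_size):
        return random.sample(list(self._storage), batch_size)


class ToyAlgo:
    """Hypothetical off-policy algorithm skeleton (assumed structure)."""

    def __init__(self, replay_buffer, min_buffer_size, batch_size,
                 gradient_steps=1):
        if batch_size > min_buffer_size:
            raise ValueError('batch_size must not exceed min_buffer_size')
        self._buffer = replay_buffer
        self._min_buffer_size = min_buffer_size
        self._batch_size = batch_size
        self._gradient_steps = gradient_steps

    def buffer_ready(self):
        """Single place where the minimum-size check lives."""
        return self._buffer.n_transitions_stored >= self._min_buffer_size

    def train_once(self):
        """Always return a list of losses; empty if nothing was trained."""
        if not self.buffer_ready():
            return []
        return [self._update() for _ in range(self._gradient_steps)]

    def _update(self):
        """Private step: raise if called before the buffer is ready."""
        if not self.buffer_ready():
            raise RuntimeError('replay buffer below minimum size')
        batch = self._buffer.sample(self._batch_size)
        # Placeholder "loss": the mean reward of the sampled batch.
        return sum(t['reward'] for t in batch) / len(batch)
```

With this shape, callers can tell "no update happened" apart from a real result by checking for an empty list, and the exception only fires if the private step is called out of order.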