Daniel Palenicek
* raylab version: 0.14.14 * Python version: 3.7.9 * Operating System: Ubuntu 18.04.5 LTS When running `raylab experiment "PG" --config examples/PG/cartpole_defaults.py` I get the following error: ``` Traceback (most recent...
The way it was before, only the very first marine would be sent and the others would just wait at the barracks' rally point. This way the marines will wait...
### 🚀 Feature I would like to implement CrossQ (https://openreview.net/pdf?id=PczQtTsTIX) in SB3, as also suggested by @araffin (https://github.com/araffin/sbx/pull/36#issuecomment-2027392759). ### Motivation CrossQ is one of the current state-of-the-art deep reinforcement learning...
This PR implements CrossQ (https://openreview.net/pdf?id=PczQtTsTIX), a novel off-policy deep RL algorithm that carefully uses batch normalisation and removes target networks to achieve state-of-the-art sample efficiency at a much lower computational...
## Description Added hyperparameters for SB3 CrossQ. ## Motivation and Context - [ ] I have raised an issue to propose this change ([required](https://github.com/DLR-RM/stable-baselines3/blob/master/CONTRIBUTING.md) for new features and bug fixes)...
I propose adding a batch renormalization (BatchRenorm) layer to flax. I would be happy to make a PR. BatchRenorm (https://arxiv.org/pdf/1702.03275.pdf) is an improved version of the vanilla BatchNorm layer. The...