roefer
In `ReinforcementLearning/PolicyGradient/SAC/tf2/networks.py:94`, you basically compute the following: `tf.math.log(1 - tf.math.pow(tf.math.tanh(actions) * self.max_action, 2) + self.noise)`. By multiplying with `self.max_action` (which is 3, at least with MuJoCo), you increase the likelihood...
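The comment is cut off above, so the following is only a small numeric illustration of the quoted expression, not a reconstruction of the rest of the argument. It evaluates the argument of the logarithm in plain Python (using `math` instead of TensorFlow) and looks for the smallest action on a coarse grid at which that argument is no longer positive. The helper names, the grid parameters, and the noise value of `1e-6` are assumptions made for this sketch; they are not taken from the repository.

```python
import math


def log_argument(action, max_action, noise=1e-6):
    """Argument of the log in the quoted expression:
    1 - (tanh(action) * max_action)^2 + noise.
    The default noise value is an assumption for illustration."""
    return 1.0 - (math.tanh(action) * max_action) ** 2 + noise


def first_invalid_action(max_action, noise=1e-6, step=0.01, limit=10.0):
    """Scan actions 0, step, 2*step, ... up to limit and return the first one
    for which the log argument is <= 0 (log undefined), or None if there is none."""
    for i in range(int(limit / step) + 1):
        action = i * step
        if log_argument(action, max_action, noise) <= 0.0:
            return action
    return None


if __name__ == "__main__":
    for max_action in (1.0, 3.0):
        print(max_action, first_invalid_action(max_action))
```

With `max_action = 1`, the squared term never exceeds 1, so the argument stays positive over the whole grid. With `max_action = 3`, the squared term can reach values close to 9, and on this grid the argument is already non-positive for actions of about 0.35.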
SimRobot is a single-threaded app. Therefore, the physics simulation and the computation of the sensor data are performed sequentially. The physics library that SimRobot uses, Open Dynamics Engine, has the...
Sorry, I don't have a pull request, but this patch fixes the issue. It basically just does what [this post](https://intellij-support.jetbrains.com/hc/en-us/community/posts/18697727524754-Deprecation-warning-with-ActionUpdateThread) suggests. [pit-idea-plugin-15-33-47.patch](https://github.com/user-attachments/files/17413297/pit-idea-plugin-15-33-47.patch)