fsoul

Results 11 issues of fsoul

I found that the code loads the pretrained weights during training. I tried training without the pretrained weights, but the result looks wrong. Here is my result. ![image](https://user-images.githubusercontent.com/16297710/83408215-d8305900-a444-11ea-81e7-50ef1c42d909.png)

Why does the code require only one env when using an RNN policy? https://github.com/marlbenchmark/off-policy/blob/release/offpolicy/scripts/train/train_mpe.py#L154

When I tried to train, it failed with RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:441. Can anyone help?

I retrained PPO in the Treechop environment, but the result differs from the paper: I only get a final reward of 20. I didn't change anything. What could the problem be?

The observations in AntMaze are laid out like [qpos, qvel], but dataset['observations'] differs from dataset['infos/qpos'] and dataset['infos/qvel']. ![1677486330536](https://user-images.githubusercontent.com/16297710/221511696-c8ca1c4d-5eca-4990-b32b-9306a40a83dd.png)
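One way to pin down where the mismatch occurs is to rebuild [qpos, qvel] from the info fields and compare it with the stored observations, row by row. The sketch below is only an illustration: `compare_obs_to_state` is a hypothetical helper, and the small random arrays are a stand-in for a loaded dataset, not real AntMaze data. It assumes the dataset is a dict of 2-D NumPy arrays under the three keys named in the issue.

```python
import numpy as np


def compare_obs_to_state(dataset, atol=1e-6):
    """Compare dataset['observations'] with [qpos, qvel] rebuilt from the infos.

    Hypothetical helper: assumes all three entries are 2-D arrays with one
    row per timestep.
    """
    obs = np.asarray(dataset["observations"])
    rebuilt = np.concatenate(
        [np.asarray(dataset["infos/qpos"]), np.asarray(dataset["infos/qvel"])],
        axis=1,
    )
    if obs.shape != rebuilt.shape:
        raise ValueError(f"shape mismatch: {obs.shape} vs {rebuilt.shape}")
    diff = np.abs(obs - rebuilt)
    return {
        # Largest absolute difference seen in each observation dimension.
        "max_abs_diff_per_dim": diff.max(axis=0),
        # Indices of timesteps where any dimension differs by more than atol.
        "mismatched_rows": np.flatnonzero((diff > atol).any(axis=1)),
    }


# Synthetic stand-in for a loaded dataset (made-up values, not real data).
rng = np.random.default_rng(0)
qpos = rng.normal(size=(5, 3))
qvel = rng.normal(size=(5, 2))
observations = np.concatenate([qpos, qvel], axis=1)
observations[2, 0] += 0.5  # inject one discrepancy at row 2, dimension 0

fake_dataset = {
    "observations": observations,
    "infos/qpos": qpos,
    "infos/qvel": qvel,
}

report = compare_obs_to_state(fake_dataset)
print(report["mismatched_rows"])
print(report["max_abs_diff_per_dim"])
```

On a real dataset, the per-dimension maxima would show whether the difference is confined to a few coordinates or spread across the whole state, and the row indices would show whether it clusters around particular timesteps.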

Hello, I tried the same config as the repo and got the same good performance as the paper. However, when I tried the halfcheetah env, the testing score is...

https://github.com/mmatl/urdfpy/blob/5466842899b33bd549e8f9e2a9a987bd5e37373b/urdfpy/urdf.py#L898 It should be np.float64...