Jo_QIU

11 comments by Jo_QIU

Sorry, I have a question. I have a setpoint control problem and used ppo_continuous_action.py for training. For evaluation, I wrote the code as follows: `env = gym.vector.SyncVectorEnv(...`
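Since the snippet in that comment is cut off by the preview, here is a minimal sketch of what such an evaluation env setup could look like, assuming gymnasium and the wrapper stack used by cleanrl's ppo_continuous_action.py; the `make_eval_env` helper name and its default arguments are hypothetical, not taken from the thread:

```python
import gymnasium as gym


def make_eval_env(env_id, gamma=0.99):
    """Hypothetical helper: a single-env SyncVectorEnv wrapped the same way
    ppo_continuous_action.py wraps the training env, so evaluation sees the
    same (normalized) observation/reward space the agent was trained on."""
    def thunk():
        env = gym.make(env_id)
        env = gym.wrappers.FlattenObservation(env)
        env = gym.wrappers.RecordEpisodeStatistics(env)
        env = gym.wrappers.ClipAction(env)
        env = gym.wrappers.NormalizeObservation(env)
        env = gym.wrappers.NormalizeReward(env, gamma=gamma)
        return env

    return gym.vector.SyncVectorEnv([thunk])


# Example usage (Pendulum-v1 is the env discussed later in this thread):
eval_envs = make_eval_env("Pendulum-v1")
```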

@vwxyzjn I updated the question above. I removed all env wrappers except FlattenObservation() and RecordEpisodeStatistics(), because I had already done the normalization in my env.py. From the printed info, the reward converges...

> Roughly speaking, you should do
>
> ```
> eval_env.obs_rms = env.obs_rms
> eval_env.return_rms = env.return_rms
> ```

I have removed these wrappers, but it still shows bad performance...
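For anyone landing here with the same problem, below is a hedged sketch of what that suggestion amounts to when the envs are wrapped the usual cleanrl way: locate the NormalizeObservation / NormalizeReward wrappers in each sub-env and share their running statistics. The `find_wrapper` helper and the `train_envs` / `eval_envs` names are assumptions for illustration, not code from the thread:

```python
import gymnasium as gym


def find_wrapper(env, wrapper_type):
    """Walk the wrapper chain (outermost to innermost) and return the first
    wrapper of the given type, or None if it is not present."""
    while env is not None:
        if isinstance(env, wrapper_type):
            return env
        env = getattr(env, "env", None)
    return None


# Assumed setup: train_envs and eval_envs are SyncVectorEnv objects built with
# the same wrapper stack; copy the running statistics from training to eval.
for train_sub, eval_sub in zip(train_envs.envs, eval_envs.envs):
    train_obs_norm = find_wrapper(train_sub, gym.wrappers.NormalizeObservation)
    eval_obs_norm = find_wrapper(eval_sub, gym.wrappers.NormalizeObservation)
    if train_obs_norm is not None and eval_obs_norm is not None:
        # Share the running mean/std so eval observations are normalized
        # exactly like the ones the agent saw during training.
        eval_obs_norm.obs_rms = train_obs_norm.obs_rms

    train_rew_norm = find_wrapper(train_sub, gym.wrappers.NormalizeReward)
    eval_rew_norm = find_wrapper(eval_sub, gym.wrappers.NormalizeReward)
    if train_rew_norm is not None and eval_rew_norm is not None:
        eval_rew_norm.return_rms = train_rew_norm.return_rms
```

Sharing the RunningMeanStd objects (rather than copying their values) also keeps the two envs in sync if training continues afterwards, which matches the spirit of the quoted one-liner. Note that gymnasium's NormalizeObservation keeps updating its statistics during evaluation rollouts; freezing them is an extra step beyond what the snippet above does.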

Yes. I tried removing them in the make_env function, which means the agent was trained on the original env, but the result was the same. **You mean, set `eval_env.obs_rms = env.obs_rms` and the...

[ppo_continuous_action.txt](https://github.com/vwxyzjn/cleanrl/files/11803623/ppo_continuous_action.txt) Sorry again. **In order to upload the file I changed the suffix to txt.** At the end of the code I added the part for regular evaluation. I tried...
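The uploaded file itself is not reproduced here, but a rough sketch of what such a "regular evaluation" block appended to ppo_continuous_action.py might look like is below. It assumes the Agent class from that script (using `actor_mean` for a deterministic rollout) and the older gymnasium vector-env autoreset API that reports finished episodes under `infos["final_info"]`; both are assumptions about the setup, not the author's actual code:

```python
import torch


@torch.no_grad()
def evaluate(agent, eval_envs, device, num_episodes=10):
    # Hedged sketch: roll out the trained agent deterministically and collect
    # the episodic returns reported by the RecordEpisodeStatistics wrapper.
    episodic_returns = []
    obs, _ = eval_envs.reset()
    while len(episodic_returns) < num_episodes:
        obs_t = torch.as_tensor(obs, dtype=torch.float32, device=device)
        action = agent.actor_mean(obs_t)  # mean action instead of sampling
        obs, _, _, _, infos = eval_envs.step(action.cpu().numpy())
        if "final_info" in infos:  # older gymnasium autoreset convention
            for info in infos["final_info"]:
                if info is not None and "episode" in info:
                    episodic_returns.append(float(info["episode"]["r"]))
    return episodic_returns
```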

> PPO does not solve `Pendulum-v1` if I recall correctly.

Pendulum-v1 is also a continuous-action env and I tested it. Here is the result.

> PPO does not solve `Pendulum-v1` if I recall correctly.

Sorry, but either way, my question is why the evaluation fails. Please, I really need this.

> > Maybe @JamesKCS can share his snippet?
>
> My solution is hacky, since I just load the wrapper from one environment, rather than the more correct way which...

> Hi, I also met the same challenge, have you solved it?

No... I didn't manage to solve the problem, because the training curve itself looks fine...

> Did you end up solving this? I have been banging my head against this the entire day....

Yes, after I carefully read the error, all of these point to...