zienn

Results 7 comments of zienn

The image has already been pulled successfully.

> Is it based on the cumulative reward? Did you find the most appropriate parameters for the MuJoCo tasks?

It seems to keep rerunning this code:

```
while self.experience_out_queue[index].empty() and not self.stop_sign.value:
    index = np.random.randint(0, self.args.num_buffers)
    time.sleep(0.1)
```

I found that ACTOR.py puts data into "experience_in_queue", but Learner.py gets data from "experience_out_queue", so the learner's buffer queue is always empty. There may be code missing to transfer...
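To illustrate the kind of hand-off that seems to be missing, here is a minimal sketch of a relay that moves items from one queue to the other. This is not the repository's code: `relay_experience`, `stop_event`, and `poll_timeout` are hypothetical names, and the demo uses the standard library's `queue.Queue` with a thread instead of the project's multiprocessing queues and buffer process.

```python
import queue
import threading


def relay_experience(in_queue, out_queue, stop_event, poll_timeout=0.1):
    """Move items from in_queue to out_queue until stop_event is set.

    Hypothetical helper: it stands in for whatever component should sit
    between the actors (producers) and the learner (consumer).
    """
    moved = 0
    while not stop_event.is_set():
        try:
            item = in_queue.get(timeout=poll_timeout)
        except queue.Empty:
            continue  # nothing from the actors yet; re-check stop_event
        out_queue.put(item)
        moved += 1
    return moved


def demo():
    # Assumption: plain thread-safe queues replace the real shared queues.
    experience_in_queue = queue.Queue()
    experience_out_queue = queue.Queue()
    stop_event = threading.Event()

    worker = threading.Thread(
        target=relay_experience,
        args=(experience_in_queue, experience_out_queue, stop_event),
    )
    worker.start()

    # "Actor" side: push three fake transitions.
    for step in range(3):
        experience_in_queue.put({"step": step})

    # "Learner" side: the items now show up on the out queue.
    received = [experience_out_queue.get(timeout=2.0) for _ in range(3)]

    stop_event.set()
    worker.join()
    return received
```

Without something playing this role, a learner that only polls the out queue would wait forever, which matches the behaviour described above.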

I found out what the problem was. It does not work when using a multiprocessing actor: run() in actor.py stops at "get_action()".

```
for i in range(self.args.max_step-1):
    state_tensor = torch.FloatTensor(self.state.copy()).to(self.device)
    if self.args.NN_type...
```

Here is the download [link](https://drive.google.com/file/d/1smRSxKAyNvTam0oaGf4WgnyQRQ9V-AH9/view?usp=sharing) for the hprof file.