Robin Ranjit Singh Chauhan
In the code above you are not doing any training, just sampling random actions. Look at https://github.com/google/dopamine/blob/master/dopamine/agents/dqn/dqn_agent.py : _train_step() is called from within step(), and you don't call step(). Look...
I see what you mean! Sorry, I'm not sure of the answer. But you could try integrating CartPole into your example to check that it makes sense.
Note that DQN usually has a very long warmup before training is allowed to begin: https://github.com/google/dopamine/blob/master/dopamine/agents/dqn/dqn_agent.py#L73 ``` min_replay_history: int, number of transitions that should be experienced before the agent begins training...
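To make the two points above concrete, here is a rough sketch of how they interact. It is not Dopamine's code: `ToyAgent`, `run`, `add_count`, and `updates` are hypothetical names made up for illustration, and the gating logic is only loosely modeled on the `min_replay_history` description quoted above. The idea it shows is that an update can only happen via `step()`, and even then only once enough transitions have been stored.

```python
import random


class ToyAgent:
    """Hypothetical stand-in for a DQN-style agent (not Dopamine's DQNAgent)."""

    def __init__(self, num_actions, min_replay_history=20, update_period=1):
        self.num_actions = num_actions
        self.min_replay_history = min_replay_history  # warmup length (assumed semantics)
        self.update_period = update_period
        self.add_count = 0       # transitions stored so far
        self.training_steps = 0  # number of calls to _train_step()
        self.updates = 0         # updates actually performed

    def _select_action(self):
        # Placeholder policy: a uniformly random action.
        return random.randrange(self.num_actions)

    def _train_step(self):
        # Train only after the warmup, and only every update_period calls.
        if self.add_count > self.min_replay_history:
            if self.training_steps % self.update_period == 0:
                self.updates += 1  # placeholder for a real gradient update
        self.training_steps += 1

    def step(self, reward, observation):
        self.add_count += 1  # placeholder for storing the transition
        self._train_step()
        return self._select_action()


def run(agent, num_steps, use_step):
    """Drive the agent, either through step() or by only sampling actions."""
    for _ in range(num_steps):
        if use_step:
            agent.step(reward=0.0, observation=None)
        else:
            agent._select_action()
    return agent.updates
```

In this sketch, a loop that only samples actions never performs an update, and a loop that does call `step()` still performs none until more than `min_replay_history` transitions have been stored. If the real agent behaves similarly, a short run could look like "no training" for either reason.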