DeepPath

Retraining code is a little different from the algorithm description.

Open zdh2292390 opened this issue 4 years ago • 0 comments

In policy_agent.py, why does the retraining code run BFS teacher-guided training after the agent fails? This is not the same as the algorithm description. Does this mean that BFS is an upper bound for the RL agent?
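To make the question concrete, here is a minimal sketch of the pattern being asked about: an episode where a failed rollout is replaced by a BFS path that then serves as the training signal. This is not the code from policy_agent.py. The toy graph, the random-walk stand-in for the policy, and the names `bfs_path`, `rollout`, and `collect_episode` are all hypothetical and only illustrate the control flow.

```python
import random
from collections import deque


def bfs_path(graph, start, goal):
    """Return a shortest path from start to goal as a list of nodes, or None."""
    queue = deque([start])
    parent = {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def rollout(graph, start, goal, max_steps, rng):
    """Hypothetical stand-in for the policy: a random walk.

    Returns (visited_nodes, success).
    """
    path = [start]
    node = start
    for _ in range(max_steps):
        neighbors = graph.get(node, [])
        if not neighbors:
            break
        node = rng.choice(neighbors)
        path.append(node)
        if node == goal:
            return path, True
    return path, False


def collect_episode(graph, start, goal, max_steps, rng):
    """Return (path, source) where source says who produced the path.

    If the agent's own rollout fails, fall back to the BFS teacher path,
    which would then be used as a supervised demonstration.
    """
    path, success = rollout(graph, start, goal, max_steps, rng)
    if success:
        return path, "agent"
    teacher = bfs_path(graph, start, goal)
    if teacher is not None:
        return teacher, "teacher"
    return path, "failed"


if __name__ == "__main__":
    toy_graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
    print(collect_episode(toy_graph, "a", "d", max_steps=3, rng=random.Random(0)))
```

In a loop like this, every failed episode is trained on a BFS path rather than on the agent's own trajectory, which is what prompts the upper-bound question.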

zdh2292390, Apr 14 '21 13:04