ReinforcementLearning.jl
Breaking the tutorial by getting TotalRewardPerEpisode out of sync with the stopping condition in a `run` call
The following code is taken from the tutorial and slightly modified to produce a fairly unintuitive error message:
```julia
using ReinforcementLearning, ReinforcementLearningBase, Flux

env1 = RandomWalk1D()
agent1 = Agent(
    policy = QBasedPolicy(
        learner = MonteCarloLearner(;
            approximator = TabularQApproximator(;
                n_state = length(state_space(env1)),
                n_action = length(action_space(env1)),
                opt = InvDecay(1.0)
            )
        ),
        explorer = EpsilonGreedyExplorer(0.1)
    ),
    trajectory = VectorSARTTrajectory()
)

stopcond1 = StopAfterEpisode(10)
hook1 = TotalRewardPerEpisode()
run(agent1, env1, stopcond1, hook1)

# Renew the hook and try another run
hook1 = TotalRewardPerEpisode()
run(agent1, env1, stopcond1, hook1)
```
The final run throws an error:
```
ERROR: MethodError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer
```
The reward vector in hook1 is getting out of sync with the stopping condition, but this isn't at all clear from the error message, and it may trip up anyone new to the package, given how easy it is to reach this point from the tutorial.

I'm using [158674fc] ReinforcementLearning v0.10.2 and ⌅ [e575027e] ReinforcementLearningBase v0.9.7. I know ReinforcementLearningBase has later versions, but `status --outdated` indicates that it is other packages from the ReinforcementLearning group that are keeping it back on v0.9.7.
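For what it's worth, the immediate cause appears to be that `StopAfterEpisode` keeps its own internal episode counter, so the reused `stopcond1` halts the second run before a single episode completes; the fresh hook's reward vector is then still empty when the hook tries to summarize it at the end of the run, which would explain the empty-collection reduction. A minimal workaround sketch, assuming that diagnosis is correct, is to renew the stop condition along with the hook:

```julia
# Workaround sketch (assuming StopAfterEpisode's internal counter is the
# culprit): construct a fresh stop condition for the second run so its
# episode counter starts from zero again.
stopcond2 = StopAfterEpisode(10)
hook2 = TotalRewardPerEpisode()
run(agent1, env1, stopcond2, hook2)  # runs the full 10 episodes
```

With a fresh stop condition the second run completes normally and the hook records ten episode rewards as expected.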
Cheers,
Colin