evaluation results being deterministic
Hello,
Even when evaluating with different random seeds, the results of the transitions are the same for both the TAF and Meta-BO algorithms. Is there any other way to set the seed besides changing the env_seed_offset variable in the following:
# define evaluation run
eval_spec = {
"env_id": env_spec["env_id"],
"env_seed_offset": 100,
"policy": af,
"logpath": logpath,
"load_iter": load_iter,
"deterministic": deterministic,
"policy_specs": policy_specs,
"savepath": savepath,
"n_workers": n_workers,
"n_episodes": n_episodes,
"T": env_spec["T"],
}
Hello, could one of the authors please help me with the above?
Hi,
thank you for your question!
The implementation uses n_workers parallel processes to sample objective functions and generate transitions from the actions produced by the policy. Each worker has its own random seed; the worker seeds are given by
env_seeds = env_seed_offset + np.arange(n_workers)
Thus, you should change env_seed_offset in steps of n_workers to obtain fresh objective functions for all of the workers.
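The seeding scheme described above can be sketched as follows (the helper function and the concrete offset/worker values are illustrative, not part of the repository):

```python
import numpy as np

def worker_seeds(env_seed_offset, n_workers):
    # Each of the n_workers processes gets its own seed,
    # offset consecutively from env_seed_offset.
    return env_seed_offset + np.arange(n_workers)

# Two evaluation runs: stepping the offset by n_workers guarantees
# that no worker reuses a seed from the previous run.
run_a = worker_seeds(100, 4)      # array([100, 101, 102, 103])
run_b = worker_seeds(100 + 4, 4)  # array([104, 105, 106, 107])

assert not set(run_a) & set(run_b)  # seed ranges do not overlap
```

If you instead incremented env_seed_offset by 1, the seed ranges of consecutive runs would overlap and most workers would see the same objective functions again.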
Best, Michael
Hello Michael,
Thank you for your response. However, I am still having trouble understanding some implementation details. Could you please help me with the following questions concerning hyperparameter optimization for one new test dataset (task):
- Would there be only one objective function, or is that not the case?
- Should the number of episodes be set to 1 as well?
- If I try to use multiple workers as you suggested above, I get the following assertion error:
assert n_episodes % n_workers == 0
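That assertion reflects that episodes are split evenly across workers, so n_episodes must be divisible by n_workers. A minimal sketch of the constraint (the helper function is hypothetical, written only to illustrate the divisibility requirement):

```python
def episodes_per_worker(n_episodes, n_workers):
    # Episodes are distributed evenly across the workers,
    # hence n_episodes must be a multiple of n_workers.
    assert n_episodes % n_workers == 0
    return n_episodes // n_workers

# With a single test task (n_episodes = 1), only n_workers = 1 is valid:
episodes_per_worker(1, 1)  # OK, one episode on one worker
# episodes_per_worker(1, 4) would raise an AssertionError
```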
Hello, could one of the authors please help me with the above?
Hi,
- Yes, one task corresponds to one objective function for optimization.
- Yes; since you want to evaluate only one task, you set n_episodes = 1 (in RL terms, one episode corresponds to one optimization run).
- During evaluation it is typically not necessary to use multiple workers, as runtime is not an issue given the comparably small number of evaluation tasks; I would leave n_workers = 1 for evaluation. On which dataset are you evaluating? If the data generation does not contain any random parameters, changing the env_seed won't necessarily change the data, and thus won't change the evaluation results either.
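The last point can be checked directly: draw the data twice with different seeds and compare. The two generator functions below are hypothetical stand-ins for your data-generation code, contrasting a fixed task (seed has no effect) with a randomized one:

```python
import numpy as np

def fixed_task(seed):
    # No random parameters: the seed has no effect, so evaluation
    # results are identical for every env_seed value.
    return np.linspace(0.0, 1.0, 5)

def randomized_task(seed):
    # A random task parameter: different seeds yield
    # different objective functions.
    rng = np.random.RandomState(seed)
    return np.linspace(0.0, 1.0, 5) * rng.uniform(0.5, 2.0)

assert np.array_equal(fixed_task(100), fixed_task(104))
assert not np.array_equal(randomized_task(100), randomized_task(104))
```

If such a comparison shows identical data for different seeds, deterministic evaluation results are expected and no choice of env_seed_offset will change them.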
Best, Michael