allow_val_change parameter is not passed to wandb.config.update in the WandBTracker
As far as I know, setting allow_val_change=True when calling wandb.init(...) allows you to have hyperparameter values that change over time (like a scheduled learning rate). However, for this to work, allow_val_change also needs to be passed to wandb.config.update(...). The stub can be viewed here:
https://github.com/wandb/wandb/blob/3966b7740896127c88bf2c143a8dd5114b5de20d/wandb/sdk/wandb_config.py#L181
And it is not passed to update in accelerate's code:
https://github.com/huggingface/accelerate/blob/b52b793ea8bac108ba61192eead3cf11ca02433d/src/accelerate/tracking.py#L223
This results in exceptions like:
[1,mpirank:0,algo-1]<stderr>:  File "/opt/ml/code/viso_ml_train/training/train.py", line 321, in <module>
[1,mpirank:0,algo-1]<stderr>:    accelerator.init_trackers(
[1,mpirank:0,algo-1]<stderr>:  File "/opt/conda/lib/python3.8/site-packages/accelerate/accelerator.py", line 1070, in init_trackers
[1,mpirank:0,algo-1]<stderr>:    tracker.store_init_configuration(config)
[1,mpirank:0,algo-1]<stderr>:  File "/opt/conda/lib/python3.8/site-packages/accelerate/tracking.py", line 223, in store_init_configuration
[1,mpirank:0,algo-1]<stderr>:    wandb.config.update(values)
[1,mpirank:0,algo-1]<stderr>:  File "/opt/conda/lib/python3.8/site-packages/wandb/sdk/wandb_config.py", line 178, in update
[1,mpirank:0,algo-1]<stderr>:    sanitized = self._update(d, allow_val_change)
[1,mpirank:0,algo-1]<stderr>:  File "/opt/conda/lib/python3.8/site-packages/wandb/sdk/wandb_config.py", line 171, in _update
[1,mpirank:0,algo-1]<stderr>:    sanitized = self._sanitize_dict(
[1,mpirank:0,algo-1]<stderr>:  File "/opt/conda/lib/python3.8/site-packages/wandb/sdk/wandb_config.py", line 231, in _sanitize_dict
[1,mpirank:0,algo-1]<stderr>:    k, v = self._sanitize(k, v, allow_val_change)
[1,mpirank:0,algo-1]<stderr>:  File "/opt/conda/lib/python3.8/site-packages/wandb/sdk/wandb_config.py", line 251, in _sanitize
[1,mpirank:0,algo-1]<stderr>:    raise config_util.ConfigError(
[1,mpirank:0,algo-1]<stderr>:wandb.sdk.lib.config_util.ConfigError: Attempted to change value of key "learning_rate" from 9e-05 to 9e-05
[1,mpirank:0,algo-1]<stderr>:If you really want to do this, pass allow_val_change=True to config.update()
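For anyone trying to picture the behaviour behind that message, here is a rough, self-contained model of the guard. It is only a sketch: GuardedConfig and this ConfigError are hypothetical stand-ins written for illustration, not wandb's actual classes, and the real check in wandb_config.py may differ in detail.

```python
class ConfigError(Exception):
    """Hypothetical stand-in for wandb's ConfigError (illustration only)."""


class GuardedConfig:
    """Toy config that refuses to overwrite an existing key with a
    different value unless the caller passes allow_val_change=True."""

    def __init__(self):
        self._items = {}

    def update(self, values, allow_val_change=False):
        # Validate every key first so a rejected update changes nothing.
        for key, new in values.items():
            changed = key in self._items and self._items[key] != new
            if changed and not allow_val_change:
                raise ConfigError(
                    f'Attempted to change value of key "{key}" '
                    f"from {self._items[key]} to {new}"
                )
        self._items.update(values)

    def __getitem__(self, key):
        return self._items[key]


config = GuardedConfig()
config.update({"learning_rate": 9e-05})

# A second update with a different value is rejected by default.
try:
    config.update({"learning_rate": 1e-05})
    rejected = False
except ConfigError:
    rejected = True

# Passing the flag to update() itself lets the change through.
config.update({"learning_rate": 1e-05}, allow_val_change=True)
```

The point of the toy is that the flag is checked on each update call, which is why it has to reach the update call and not only the initial setup.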
Crazy speed on that PR @muellerzr , thank you. Always impressed by the accelerate team.
@plamb-viso we've decided that in this case you should do this manually, since allowing config values to change is a niche need that is specific to wandb.
(And it starts to build a very confusing API, not good :) )
So to do what you'd like you can perform the following after doing Accelerator.init_trackers(), assuming that no configuration was passed in:
if accelerator.is_main_process:
    wandb.config.update(values, allow_val_change=True)
If that winds up not being enough, you should just initialize and use your own trackers outside the Accelerator and interact with them under an if accelerator.is_main_process block like above, as the Accelerator is only there to provide a lightweight general interface.
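If you end up doing the manual update in several places, the main-process guard can be folded into a small helper. This is only a sketch: update_tracker_config and RecordingConfig are hypothetical names made up for this example, and the helper assumes nothing beyond an object with an update(values, allow_val_change=...) method, which is how wandb.config.update is called above.

```python
def update_tracker_config(config, values, is_main_process, allow_val_change=True):
    """Call config.update(...) on the main process only.

    Returns True if the update was issued, False if it was skipped.
    """
    if not is_main_process:
        return False
    config.update(values, allow_val_change=allow_val_change)
    return True


class RecordingConfig:
    """Hypothetical fake used here in place of wandb.config."""

    def __init__(self):
        self.calls = []

    def update(self, values, allow_val_change=False):
        # Record the call instead of talking to a real tracker.
        self.calls.append((dict(values), allow_val_change))


fake = RecordingConfig()
ran_on_main = update_tracker_config(fake, {"learning_rate": 9e-05}, is_main_process=True)
ran_on_worker = update_tracker_config(fake, {"learning_rate": 9e-05}, is_main_process=False)
```

In a real script the call would presumably look like update_tracker_config(wandb.config, values, accelerator.is_main_process), made after accelerator.init_trackers().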
We've also introduced a way to interact with the current experiment runner directly, so you can get to it without having to search for it: https://github.com/huggingface/accelerate/pull/594
@muellerzr good suggestion. For others who find this, this is how I got it to work:
import os

import wandb
from accelerate import Accelerator

accelerator = Accelerator(
    log_with='wandb'
)
# initialize wandb (config, job_type_str, run_name and hyperparams are defined elsewhere)
if accelerator.is_main_process:
    os.environ['WANDB_API_KEY'] = config.wandb_api_key
    wandb_kwargs = {
        'entity': config.wandb_entity,
        'group': config.model_name,
        'job_type': job_type_str,
        'name': run_name,
        'allow_val_change': True
    }
    accelerator.init_trackers(
        project_name=config.wandb_project,
        init_kwargs={'wandb': wandb_kwargs}
    )
    # pass the flag to update() as well, since init_trackers does not forward it
    wandb.config.update(hyperparams, allow_val_change=wandb_kwargs['allow_val_change'])