dvc.org icon indicating copy to clipboard operation
dvc.org copied to clipboard

ref: Hydra support for `exp run --set-param`

Open daavoo opened this issue 3 years ago • 9 comments

P.R. for https://github.com/iterative/dvc/pull/8067

daavoo avatar Jul 29 '22 09:07 daavoo

Link Check Report

  • content/docs/command-reference/exp/run.md
    • PASS: https://hydra.cc/docs/advanced/override_grammar/basic/#modifying-the-config-object (200)
    • PASS: https://hydra.cc/docs/advanced/override_grammar/basic/ (200)
    • PASS: https://dvc.org/doc/user-guide/project-structure/dvcyaml-files#parameter-dependencies (200)

All 3 links passed!

github-actions[bot] avatar Jul 29 '22 09:07 github-actions[bot]

One tricky issue here is that we update params.yaml or other params files with the + and ~ syntax to add/remove parameters, but we don't update dvc.yaml.

For example, assume you have a dvc.yaml like:

stages:
  train:
    cmd: python train.py
    params:
    - params.yaml:
      - train.lr
...
  • dvc exp run --set-param=~train.optim will fail because the param will be dropped from params.yaml but dvc.yaml still expects it.
  • dvc exp run --set-param=+train.optim=Adam will add the param to params.yaml but not track it as a new parameter.'

So this syntax only really works when specifying a whole params file or a key like train for a group of params.

dberenbaum avatar Jul 29 '22 15:07 dberenbaum

One tricky issue here is that we update params.yaml or other params files with the + and ~ syntax to add/remove parameters, but we don't update dvc.yaml.

For example, assume you have a dvc.yaml like:

stages:
  train:
    cmd: python train.py
    params:
    - params.yaml:
      - train.lr
...
* `dvc exp run --set-param=~train.optim` will fail because the param will be dropped from `params.yaml` but `dvc.yaml` still expects it.

* `dvc exp run --set-param=+train.optim=Adam` will add the param to params.yaml but not track it as a new parameter.'

So this syntax only really works when specifying a whole params file or a key like train for a group of params.

Good point. Do you think it's worth adding an explicit note in the ref (I was avoiding including info that we will end up extracting to user guide)? We could add an <admon> warning

daavoo avatar Jul 29 '22 15:07 daavoo

Do we need to specify if there is any particular syntax that is not supported? For example, in the current PR, I don't think https://hydra.cc/docs/advanced/override_grammar/basic/#modifying-the-defaults-list is applicable (although it probably will be in an upcoming one).

Should we have examples or is it repeating Hydra docs too much? Possible examples:

  • +newparam=1
  • ~dropparam
  • custom_file.json:+newparam=1
  • +newparam=${oldparam}
  • param=[1,2,3]

dberenbaum avatar Jul 29 '22 15:07 dberenbaum

Leaving high-level product thoughts here instead of https://github.com/iterative/dvc/pull/8067. Overall, it looks great as a first step. Seems more flexible and stable than what we have now.

My only additional concern is to not force users immediately to the Hydra docs to start using --set-param since that's pretty daunting. I think we should make clear the basic syntax at least (--set-param filename:key=value) without referring users to Hydra.

dberenbaum avatar Jul 29 '22 15:07 dberenbaum

Do you think it's worth adding an explicit note in the ref (I was avoiding including info that we will end up extracting to user guide)? We could add an <admon> warning

No strong opinion as long as we don't lose track of documenting it.

dberenbaum avatar Jul 29 '22 15:07 dberenbaum

@dberenbaum Updated with examples for basic syntax and warning about dvc.yaml

daavoo avatar Aug 01 '22 10:08 daavoo

Looks good @daavoo! Can you resolve the merge conflict and do one more iteration? I think we could merge as is, but since there is no rush here, let's try to work on it a bit more.

dberenbaum avatar Aug 03 '22 19:08 dberenbaum

Looks good @daavoo! Can you resolve the merge conflict and do one more iteration? I think we could merge as is, but since there is no rush here, let's try to work on it a bit more.

Tried to address the comments. PTAL

daavoo avatar Aug 08 '22 19:08 daavoo

@jorgeorpinel Do you want to take a final pass at this or not?

dberenbaum avatar Aug 23 '22 17:08 dberenbaum

@alex000kim Maybe you want to review this one? Note that this is only for part 1 of the Hydra integration, which only includes https://github.com/iterative/dvc/pull/8067, not https://github.com/iterative/dvc/pull/8093.

dberenbaum avatar Aug 23 '22 17:08 dberenbaum

Merging to move forward. Don't hesitate to post additional comments and I will address them in subsequent P.R.

daavoo avatar Aug 24 '22 13:08 daavoo

Hi. I probably won't be reviewing all ref PRs in the future but I'm still trying to check as much as possible for some time to compile best practices for each kind of doc. So

we update params.yaml or other params files with the + and ~ syntax to add/remove parameters, but we don't update dvc.yaml... So this syntax only really works when specifying a whole params file or a key like train for a group of params.

Yeah this is intriguing. I guess for now let's try to avoid examples that would need a dvc.yaml update (defeating the purpose of "on-the-fly" param setting).

avoiding including info that we will end up extracting to user guide

There's no PR for that yet right? That's higher level so I should try to be more involved.

Possible examples...

Yes these are all great. I don't think we covered them all in the example after all?

jorgeorpinel avatar Sep 04 '22 06:09 jorgeorpinel