
In monorepo `dvc exp remove -A` should remove only experiments within the sub-dir project scope

Open mnrozhkov opened this issue 2 years ago • 3 comments

Summary / Background

I'm testing DVC Experiments for a monorepo scenario and encountered unexpected behavior with dvc exp remove.

Within a monorepo, say we have:

- / 
  - project_a
  - project_b
  - root_content

When working inside project_a, I list experiments

dvc exp list  

and get 2 experiments I ran for project_a

main:                                                                 
        5a4cba8 [paled-flus]
        81c2de9 [tippy-scut]

Then, I want to remove all experiments for project_a with

dvc exp remove -A

DVC removes all experiments for all projects in the repo:

Removed experiments: 'finer-limb', 'perdu-vase', 'paled-flus', 'heady-mate', 'older-tipi', 'dusty-tang', 'alpha-gyms', 'downy-kiwi', 'moved-bomb', 'butch-iglu', 'silly-fibs', 'olive-roam', 'bosom-curb', 'unlet-soja', 'tippy-scut', 'sassy-dawn', 'braky-baby', 'coaly-kill', 'moldy-moot', 'pappy-gest', 'split-dogs', 'elite-bort', 'shock-dado', 'boozy-bade', 'pucka-thaw', 'mossy-jird', 'splay-tosh', 'famed-afro', 'finer-torc', 'bijou-yolk', 'fetid-mope', 'tangy-trio', 'legal-ludo', 'cagey-sech', 'addle-chic', 'eerie-barb', 'noisy-rods', 'sarky-joey', 'older-jest', 'umber-tote', 'sable-moit', 'aging-doge', 'puffy-esse' and 'power-harl'

Expected behavior

dvc exp remove -A should remove only experiments within the project_a scope.

mnrozhkov, Jan 17 '24 09:01

Thanks for the report!

The DVC CLI does not do any monorepo slicing at the moment -- that is limited to Studio. You only saw 2 experiments in dvc exp list because you did not include -A for that command. All dvc exp commands look at the entire repo.

dberenbaum, Jan 17 '24 20:01

Looks like https://github.com/iterative/dvc/issues/10244#issuecomment-1899063451 is related.

I didn't notice at first that these are entirely different dvc projects with project_a/.dvc and project_b/.dvc. @pmrowla Thoughts on how dvc exp commands should handle this scenario? See the link above, where it looks like we end up pushing two copies of each experiment from each project.

dberenbaum, Jan 18 '24 21:01

I'm not sure we are actually pushing two copies of the experiment. My guess is that two separate experiments with the same name are being generated.

Git refs (and exps) apply to the entire repository. If we actually need to separate exps by subrepo, then we need to extend the exp ref namespace to differentiate them in the dvc init --subdir case.

So the exp refs would go somewhere like:

refs/exps/<git-sha>/subrepo/path/to/subdir/exp-name
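
As a rough illustration of how such a namespace could be used to scope commands to one subproject, here is a minimal Python sketch. It is not DVC's implementation: the helper names (build_ref, parse_ref, refs_in_subdir), the placeholder SHA, and the assumption that experiment names contain no "/" are all hypothetical, and the ref layout is simply taken literally from the line above.

```python
from typing import List, NamedTuple, Optional

# Segments taken literally from the proposed layout; not an existing DVC constant.
EXPS_PREFIX = "refs/exps"
SUBREPO_MARKER = "subrepo"


class ExpRef(NamedTuple):
    baseline_sha: str  # Git commit the experiment is based on
    subdir: str        # POSIX path of the DVC project, relative to the Git root
    name: str          # experiment name (assumed to contain no "/")


def build_ref(baseline_sha: str, subdir: str, name: str) -> str:
    """Compose a ref name following the proposed layout (hypothetical helper)."""
    return "/".join(
        [EXPS_PREFIX, baseline_sha, SUBREPO_MARKER, subdir.strip("/"), name]
    )


def parse_ref(ref: str) -> Optional[ExpRef]:
    """Split a ref back into its parts; return None if it does not match."""
    parts = ref.split("/")
    # Expected shape: refs / exps / <sha> / subrepo / <subdir...> / <name>
    if len(parts) < 6 or parts[:2] != ["refs", "exps"] or parts[3] != SUBREPO_MARKER:
        return None
    return ExpRef(parts[2], "/".join(parts[4:-1]), parts[-1])


def refs_in_subdir(refs: List[str], subdir: str) -> List[str]:
    """Keep only the refs that belong to the given subproject."""
    subdir = subdir.strip("/")
    return [r for r in refs if (p := parse_ref(r)) and p.subdir == subdir]


# Placeholder data: "abc123" stands in for a real baseline commit SHA.
all_refs = [
    build_ref("abc123", "project_a", "paled-flus"),
    build_ref("abc123", "project_a", "tippy-scut"),
    build_ref("abc123", "project_b", "finer-limb"),
]
project_a_refs = refs_in_subdir(all_refs, "project_a")
```

With refs laid out this way, a command run from inside project_a could restrict itself to project_a_refs instead of touching every experiment in the Git repository.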

pmrowla, Jan 19 '24 01:01