ref: document `dvc queue` and `dvc exp` task-queuing changes
Docs meta-issue for https://github.com/iterative/dvc/issues/7592
- [x] Initial documentation (command refs) will be driven by the core team.
- [x] Update existing `exp` commands/flags that get aliased/deprecated or otherwise changed.
- [ ] User guides/other docs should be updated, but existing functionality is being preserved (so there is no special rush on this rn).
Thanks @pmrowla ! Can you link to any relevant existing materials such as wiki, internal docs, etc? Thanks
There is no existing wiki/documentation. Internal-only proposal/outline for the new CLI can be found in notion: https://www.notion.so/iterative/Queueing-Managing-Experiment-Execution-bb07bf856cd242bd98a2c87cfc6e75d7#3f91a12217f04550a7c95b4a4fe252c1 (final CLI does not completely match the initial proposal)
I think we can at least start with the current help output to give an overview:
$ dvc queue --help
usage: dvc queue [-h] [-q | -v] {start,stop,status,logs,remove,kill} ...
Commands to manage experiments queue.
Documentation: <https://man.dvc.org/queue>
positional arguments:
{start,stop,status,logs,remove,kill}
Use `dvc queue CMD --help` to display command-specific help.
start Start experiments queue workers.
stop Stop experiments queue workers.
status List the status of the queue tasks and workers.
logs Show output logs for a task in the experiments queue.
remove Remove tasks in experiments queue.
kill Kill tasks in experiments queue.
optional arguments:
-h, --help show this help message and exit
-q, --quiet Be quiet.
-v, --verbose Be verbose.
ETA for release? Thanks
We are hoping to have it released by the end of the month @jorgeorpinel
I think it would be ideal to have drafts for the references sooner rather than later. Reviewing them can double as basic QA to improve the release (preferably in several smaller PRs).
p.s. is there a dev branch to play with this?
> p.s. is there a dev branch to play with this?
Yes, it's on dvc-task-dev.
A few questions to consider for now.
> usage: dvc queue
Shouldn't this be inside `dvc exp`? I get that it's too many subcommands though but maybe `dvc exp-queue` then?
> start Start experiments queue workers.
I understand that by default this goes into a background process. Is there a way to start it in the foreground? Say `queue start --attached`
Wait. What happened to `queue attach`? 🙂
> stop Stop experiments queue workers.
Does this kill the running exps? Do they then become failed?
What about the rest of the queue, does it remain in Queued state?
When you restart, where does it start from?
> remove Remove tasks in experiments queue.
Please confirm the difference or overlap between this and `exp remove`.
> Shouldn't this be inside `dvc exp`? I get that it's too many subcommands though but maybe `dvc exp-queue` then?
This was discussed in the initial planning, in the end we went with dvc queue to avoid having too many nested commands.
> I understand that by default this goes into a background process. Is there a way to start it in the foreground? Say `queue start --attached`
There is currently no flag for this. In the meantime you can use `exp run --run-all` to get the same behavior, but eventually this can be folded into a flag for `queue start`.
> Wait. What happened to `queue attach`? 🙂
It was folded into queue logs -f/--follow
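A rough sketch of the two foreground-style options mentioned so far, assuming the flags behave as described in this thread. The `DVC` variable, the function names, and the task argument are only illustrative and not part of DVC; pointing `DVC` at `echo` gives a dry run that prints the commands.

```shell
# Illustrative wrappers only. DVC defaults to the real binary but can be
# overridden (e.g. DVC=echo) to print the commands instead of running them.
DVC="${DVC:-dvc}"

# Interim workaround for a foreground queue run: execute all queued
# experiments in the current process instead of a background worker.
run_queue_in_foreground() {
  "$DVC" exp run --run-all
}

# Follow the output of one queued task (what `queue attach` was meant to do).
# $1 is a hypothetical task name taken from `dvc queue status`.
follow_task_logs() {
  "$DVC" queue logs --follow "$1"
}
```

For example, `follow_task_logs my-task` would stream the logs of a task named `my-task`, if such a task exists in the queue.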
> Does this kill the running exps? Do they then become failed? What about the rest of the queue, does it remain in Queued state? When you restart, where does it start from?
By default, `queue stop` will finish any currently executing experiments and then stop the queue worker. Any remaining queued experiments stay in the queue. `queue stop --kill` will kill any currently running experiments and stop the queue processing immediately. (Killed experiments will be marked as failed unless the user's pipeline/stage command has special handling for SIGKILL/SIGTERM, which is unlikely in typical cases.)
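The two stop modes can be sketched side by side as below. This assumes the released CLI matches the behavior described in this comment. The `DVC` variable and function names are made up for illustration, and setting `DVC=echo` turns the calls into a dry run.

```shell
# Illustrative wrappers only. DVC defaults to the real binary but can be
# overridden (e.g. DVC=echo) to print the commands instead of running them.
DVC="${DVC:-dvc}"

# Graceful stop: currently executing experiments are allowed to finish,
# then the worker stops. Remaining queued experiments stay in the queue.
stop_gracefully() {
  "$DVC" queue stop
}

# Immediate stop: running experiments are killed and queue processing
# halts right away. Killed experiments are expected to be marked as failed.
stop_now() {
  "$DVC" queue stop --kill
}
```

After either call, `dvc queue status` should show what is still queued, and `dvc queue start` should resume processing those remaining entries.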
> Please confirm the difference or overlap between this and `exp remove`.
The intended behavior is that there be no overlap between the two commands.
- `queue remove` will specifically apply to queued experiments and queue artifacts (i.e. it removes queue entries and any saved logs for old queue entries that can be accessed with `queue logs`)
- `exp remove` will specifically apply to successful experiments (i.e. it removes DVC exp git refs and any associated DVC cache data for those exp refs)
The existing --queue related flags for exp remove and gc will be deprecated and eventually removed to make this separation clearer.
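To make the split concrete, here is a minimal sketch of how the two commands would be used under that separation. The positional task/experiment arguments are an assumption about the final CLI, and the `DVC` variable and function names are illustrative only (use `DVC=echo` for a dry run).

```shell
# Illustrative wrappers only. DVC defaults to the real binary but can be
# overridden (e.g. DVC=echo) to print the commands instead of running them.
DVC="${DVC:-dvc}"

# Queue side: drop a queue entry together with its saved logs.
# $1 is a hypothetical task name; passing it positionally is an assumption.
remove_queue_task() {
  "$DVC" queue remove "$1"
}

# Experiment side: drop a finished experiment, i.e. its exp git refs and
# the associated cache data. $1 is a hypothetical experiment name.
remove_experiment() {
  "$DVC" exp remove "$1"
}
```

Neither call is meant to touch the other's data: removing a queue entry leaves experiment results alone, and removing an experiment leaves queue entries and logs alone.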
> I understand that by default this goes into a background process. Is there a way to start it in the foreground? Say `queue start --attached`
> There is currently no flag for this. In the meantime you can use `exp run --run-all` to get the same behavior, but eventually this can be folded into a flag for `queue start`
queue logs -f/--follow without a task should automatically follow the currently running experiment in the future.
> Please confirm the difference or overlap between this and `exp remove`.
While developing the queue-related features, I realized that queue tasks and experiments are different things. They are strongly related but still distinct: tasks are focused on execution, while experiments are focused on the result. One checkpoint task can generate dozens of experiments, and experiments can be run without using a queue worker. The difference also shows up in the status/show tables: we can delete the message for a succeeded task and leave the experiment result (revision) untouched.
Are there guides that still need to be updated here?
User guides were updated. If there's anywhere we find the queue info missing, please open a new issue.