Total progress when copying or moving multiple files?
When copying or moving multiple files using e.g. the command cp 1 2 3 target/directory/, those parameters are visible using ps aux.
I'm aware it's also possible to find the current working directory of a process using /proc/<pid>/cwd, so you could use that to determine which files cp / mv has already operated on, as well as which files it's going to operate on. The command-line arguments are relative to the working directory, but it seems we could easily get that information and work out the full file paths.
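To make the idea above concrete, here is a minimal sketch (not how cv does anything today) of reading a process's arguments and working directory from /proc and turning the operands into absolute paths. The helper names parse_cmdline, resolve_operands and inspect_process are hypothetical, and the option handling is a deliberate simplification: anything starting with "-" is skipped, so options that take a separate argument (such as cp -t DIR) would be misread.

```python
import os


def parse_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into arguments.

    Simplification: empty arguments are dropped along with the trailing NUL.
    """
    return [os.fsdecode(part) for part in raw.split(b"\0") if part]


def resolve_operands(argv, cwd):
    """Return absolute paths for the non-option arguments of argv.

    Assumption: every argument starting with '-' is a flag with no separate
    value, and '--' ends option parsing. Real cp/mv parsing is richer.
    """
    operands = []
    options_done = False
    for arg in argv[1:]:
        if not options_done and arg == "--":
            options_done = True
            continue
        if not options_done and arg.startswith("-") and arg != "-":
            continue
        # Relative operands are interpreted against the process's cwd.
        operands.append(os.path.normpath(os.path.join(cwd, arg)))
    return operands


def inspect_process(pid):
    """Read argv and cwd of a running process from /proc (Linux only)."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        argv = parse_cmdline(f.read())
    cwd = os.readlink(f"/proc/{pid}/cwd")
    return resolve_operands(argv, cwd)
```

For cp 1 2 3 target/directory/ the last returned path would be the target and the ones before it the sources. Reading /proc entries of another user's process may fail with a permission error.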
With that information, wouldn't it be possible for cv to show the total progress (counting all files to be copied or moved) as well as the current file progress?
I would be willing to work on a pull request for this, but would first want to make sure you think it's doable, whether it's a good idea, etc.
Thanks.
Hi,
While it seems very feasible, I currently don't think it's a good idea. This tool is supposed to be generic; hard-coding special behavior for specific commands does not fit that goal.
This is not the first time that turning cv from a generic tool into a command-aware one has been discussed, and I'm ready to consider this option, but it will require a lot of work.
Hi. Cute avatar :)
I understand the wish to keep the tool generic; I would strive for that too. Maybe a workaround would be to allow for Git-like extensions?
In Git, if you have a command called git-blablabla on the path, and you invoke git blablabla, Git will look for a command called git-blablabla on the path and delegate the whole command invocation to it.
Maybe cv could, once it's targeting a specific program, get the program's basename and look for cv-<program name> (e.g. cv-cp) on the path, and, if found, somehow leverage it to get more specific information on that running program.
That would help us extend cv without modifying it, so it could stay generic and clean.
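As an illustration of the lookup step only, here is a minimal sketch of finding a cv-<program name> helper on the path. Nothing like this exists in cv; the function name find_helper and the "cv-" prefix default are assumptions taken from the proposal above, and how cv would talk to the helper once found is left open.

```python
import os
import shutil


def find_helper(exe_path, prefix="cv-", path=None):
    """Return the full path of an executable named <prefix><basename>, or None.

    exe_path is the monitored program (e.g. "/bin/cp"); path optionally
    overrides the PATH string that is searched.
    """
    name = prefix + os.path.basename(exe_path)
    return shutil.which(name, path=path)
```

With a cv-cp executable on the path, find_helper("/bin/cp") would return its location; for a program with no helper it returns None and the generic behavior would apply.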
If you were to make cv usage non-generic, would you incorporate program-specific inspection strategies into it, or would you go for an extension system? Have you considered something like that or other ways to extend cv?
Note: the name of the project was changed from cv to progress.
I would like progress to have a switch, such as -a ("all"), that will tell me the progress across all files being moved or copied in a single command. Yes, that's special-casing for programs, but so what - they're part of the core tools, in POSIX, etc., so they're important. Please do this :)
Note I came here not because I thought "oh, hm, that might be a good idea" but because I use progress every day and it turns out to be a feature I wish for nearly every time.
+1
+1 :+1:
I've been thinking about a way to make progress deal with multiple files per command (and /proc/<pid>/cmdline can probably help), but I can't come up with a good approach for commands like cp -Rf source/ dest/
How can we get the list of files? How can we know what was already copied and what's left? Moving from a generic tool to a "command-aware" one sounds more difficult than I originally thought.
On my computer the order cp does things is deterministic - try figuring that part out. Of course, this won't work if the fs gets modified in the meantime.
Once you know the files' order you can check sizes.
File order for cp -R and other recursive forms of coreutils commands is built with readdir() (coreutils source code, file savedir.c, function savedirstream()), and this function doesn't guarantee any ordering, so I can't really build a tool on empirical observation. We would also need to duplicate a massive amount of code from coreutils to deal with recursive scans, because it's a tricky subject for commands like cp and mv.
It looks like a big non-portable error-prone ugly hack to me. Looking at it, even the non recursive form is full of traps.
Took me a short while to figure it out, but the git version of coreutils doesn't contain savedir.c, because that got split off into gnulib, which you can use. It is the "gnu portability library" so it's exactly what we want. I am not sure if you can use savedirstream() as is or if you need to liberally copy from the body of that function, but all it does is bookkeeping in a loop written around readdir(), which is where the portability actually happens.
@Xfennec, now that I understand the difficulties involved in this, I'm also thinking it's just a bad idea. Maybe you could only show total progress when multiple files are specified explicitly via the command-line? E.g. cp 1 2 3 *.whatever target/dir/. Maybe that's a good compromise, since this should actually be easy to implement.
@n2liquid it seems like gnulib provides a lot of the functionality needed for -R support.
@cheater The main issue here is the lack of ordering of readdir(). We have no guarantee the file list order will be the same as the one cp (for instance) is building along its recursive copy.
The other issue is the complexity of the scan: cross-device copies, symbolic links, hard links, duplicate names on the CLI, and so on. This code is not part of gnulib (copy.c is a good sample of this complexity) and that's where I see potential portability trouble (but my main concern is having to deal with such a huge amount of complex code).
Why does the order matter, though? All you need to know to compute the total progress is the list of file sizes and how many bytes have been copied (which is already implemented).
What is already implemented only covers the currently opened file. To get "how many bytes have been copied since the beginning", we need to know which file is the current one and where it sits in the total (ordered) list of files, so we can deduce the previous files (and sum their sizes).
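A minimal sketch of that computation, assuming the ordered list of file sizes is somehow known (which is exactly the hard part discussed here). The function name total_progress is hypothetical and not part of progress.

```python
def total_progress(sizes, current_index, current_offset):
    """Fraction of all bytes copied so far.

    sizes          -- file sizes in the order the command processes them
    current_index  -- position of the file currently being copied
    current_offset -- bytes already copied from that file
    """
    total = sum(sizes)
    if total == 0:
        return 1.0
    # Every file before the current one is assumed to be fully copied.
    done = sum(sizes[:current_index]) + current_offset
    return done / total


# Same three files, same current position, but a different assumed order:
print(total_progress([100, 300, 600], 1, 150))  # 0.25
print(total_progress([600, 300, 100], 1, 150))  # 0.75
```

The two calls show why the order matters: the same current file and offset give very different totals depending on which files are assumed to come before it.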
@Xfennec there is no order guarantee, but this code doesn't go out of its way to provide random order either. It'll be the same over two calls, unless something's changed. I don't think people using progress will mind if it's off by a bit.
Basically, the order is provided by the underlying filesystem(s); some of them are quite complex, and I'm pretty sure that a big re-indexing can happen in apparently basic use cases. If the order of the root directory of the copy changes, we're no longer talking about small errors.
The role of progress is to show progress. And here, I have no guarantee that it can do its job correctly. It's hard to accept. (but tempting, I'm with you on that :)
Just throwing it out there: Can't we maybe use mtime and/or atime to help at least detect that a file system re-indexing has started since the start of progress and then revert back to the standard behavior?
In any case, this isn't becoming just a small special case for cp / mv, it's becoming a project of its own (like 100+ sloc). Maybe if someone implements that separately from progress, progress could just use that if it's available in the system? What do you think?
At our level, we have no idea what the FS is doing. A change at the other end of the disk, by another user, with no relation to our copy/move/whatever -may- change the order of files. There's no guarantee it can't happen.
I am very unmoved by this last argument :)
Also, you could keep tabs on the target using inotify and use that to compute how much has been done. You only need to grab the file list once. If files were added or deleted between cp and you grabbing the list of files you will be inaccurate. But we're giving it our best effort.
@cheater, why don't we hack this progress extension together as some sort of separate tool it can leverage like I mentioned?
No time here
Me neither tbh, but maybe it wouldn't be too time consuming. I might give it a try... Maybe.
Go for it! Would love to see what you can come up with - I bet there are lots of interesting corner cases
I have no intention of covering all corner cases, though. I wish there weren't so many! If I do this, I'll implement what's useful for me and if people need support for say weird NFS mounts and other things, they can send PRs :-)
Since this would be installed on the side and wouldn't be part of progress, it wouldn't need to aim at the same level of quality and support for corner cases as progress. People install it if it's useful to them, or they simply don't, and hopefully everyone would be happy. Thoughts, @Xfennec? Feel free to be brutally honest.
Feel free to experiment, of course. I will keep this subject on my list, too.
What about just calculating the total size of the source and destination folders at the start and then using that as the percentage metric?
And what if the destination folder already existed and is being overwritten by the copy?
You're right, this wasn't my case, so I didn't think of that!
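For reference, the size-comparison idea could be sketched roughly as follows. This is only an illustration with hypothetical helper names (tree_size, percent_done). Subtracting the destination's size at the start helps when the destination already holds unrelated files, but it does not fix the overwrite case raised above, since replaced files do not grow the destination by their full size.

```python
import os


def tree_size(root):
    """Sum the sizes of all regular entries under root (symlinks not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file vanished or is unreadable; skip it (best effort).
                pass
    return total


def percent_done(src_total, dst_now, dst_at_start=0):
    """Estimate completion from how much the destination has grown."""
    if src_total <= 0:
        return 100.0
    copied = max(0, dst_now - dst_at_start)
    return min(100.0, 100.0 * copied / src_total)
```

A caller would measure tree_size(source) and tree_size(destination) once at the start, then poll tree_size(destination) and pass the values to percent_done.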