Total progress when copying or moving multiple files?

Open guiprav opened this issue 10 years ago • 40 comments

When copying or moving multiple files using e.g. the command cp 1 2 3 target/directory/, those parameters are visible using ps aux.

I'm aware it's also possible to find the current working directory of a process using /proc/<pid>/cwd, so you could use that to determine which files cp / mv has already operated on, as well as which files it's going to operate on. The command-line arguments are relative to the working directory, but it seems we could easily get that information and work out the full file paths.

With that information, wouldn't it be possible for cv to show the total progress (counting all files to be copied or moved) as well as the current file progress?
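A minimal sketch of that lookup, in Python for brevity rather than in the language progress is written in. The helper names `read_process_args` and `resolve_operands` are hypothetical, and the option handling is deliberately naive: it assumes no option takes a separate argument (so something like `-t DIR` would be misread) and that the last operand is the destination.

```python
import os


def read_process_args(pid):
    """Read argv and the working directory of a running process (Linux /proc)."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        raw = f.read()
    # Arguments are NUL-separated; empty arguments are dropped here for simplicity.
    argv = [os.fsdecode(a) for a in raw.split(b"\0") if a]
    cwd = os.readlink(f"/proc/{pid}/cwd")
    return argv, cwd


def resolve_operands(argv, cwd):
    """Return (sources, target) as absolute paths for a cp/mv-style argv.

    Assumption: anything starting with '-' before a literal '--' is an
    option without its own argument, and the last operand is the target.
    """
    operands = []
    options_done = False
    for arg in argv[1:]:
        if not options_done:
            if arg == "--":
                options_done = True
                continue
            if arg.startswith("-") and arg != "-":
                continue
        # Relative operands are resolved against the process's cwd.
        operands.append(os.path.normpath(os.path.join(cwd, arg)))
    if len(operands) < 2:
        return operands, None
    return operands[:-1], operands[-1]
```

For `cp 1 2 3 target/directory/` started in `/home/user`, this would give the three source paths under `/home/user` plus the target directory. Shell globs are already expanded by the time they reach /proc/<pid>/cmdline, so they need no extra handling.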

I would be willing to work on a pull request for this, but would first want to make sure you think it's doable, whether it's a good idea, etc.

Thanks.

guiprav avatar Jul 30 '15 01:07 guiprav

Hi,

While it seems very feasible, I currently don't think it's a good idea. This tool is supposed to be generic; hard-coding special behavior specific to some commands does not fit that idea.

This is not the first time that turning cv from a generic tool into a command-aware one has come up, and I'm ready to consider this option, but it will require a lot of work.

Xfennec avatar Jul 30 '15 07:07 Xfennec

Hi. Cute avatar :)

I understand the desire to keep the tool generic; I would strive for that too. Maybe a workaround would be to allow for Git-like extensions?

In Git, if you have a command called git-blablabla on the path, and you invoke git blablabla, Git will look for a command called git-blablabla on the path and delegate the whole command invocation to it.

Maybe cv could, once it's targeting a specific program, get the program's basename and look for cv-<program name> (e.g. cv-cp) on the path, and, if found, somehow leverage it to get more specific information on that running program.
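A rough sketch of what that lookup could look like, in Python for illustration only. The `cv-` prefix comes from the suggestion above; the function name `find_extension` and its `search_path` parameter are made up for this example, and how cv would actually talk to the helper once found is left open.

```python
import os
import shutil


def find_extension(program_path, prefix="cv-", search_path=None):
    """Look for an executable named <prefix><basename of program> on the PATH.

    Returns the helper's full path, or None when no such helper is installed.
    search_path overrides the PATH environment variable (useful for testing).
    """
    helper_name = prefix + os.path.basename(program_path)
    return shutil.which(helper_name, path=search_path)
```

With this, a monitored `/bin/cp` would map to a helper called `cv-cp`, and the generic code path would be used whenever the function returns `None`.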

That would help us extend cv without modifying it, so it could stay generic and clean.

If you were to make cv usage non-generic, would you incorporate program-specific inspection strategies into it, or would you go for an extension system? Have you considered something like that or other ways to extend cv?

guiprav avatar Jul 30 '15 15:07 guiprav

Note: the name of the project was changed from cv to progress. I would like progress to have a switch, such as -a ("all"), that will tell me the progress across all files being moved or copied in a single command. Yes, that's special-casing for programs, but so what? They're part of the core tools, in POSIX, etc., so they're important. Please do this :)

cheater avatar Oct 31 '15 10:10 cheater

Note I came here not because I thought "oh, hm, that might be a good idea" but because I use progress every day and it turns out to be a feature I wish for nearly every time.

cheater avatar Oct 31 '15 10:10 cheater

+1

groggemans avatar Nov 28 '15 10:11 groggemans

+1 :+1:

Ericmas001 avatar Jan 06 '16 01:01 Ericmas001

I've been thinking about a way to make progress able to deal with multiple files per command (and /proc/<pid>/cmdline can probably help), but I can't find a good approach for commands like cp -Rf source/ dest/.

How can we get the list of files? How can we know what was already copied and what's left? Moving from a generic tool to a "command-aware" one sounds more difficult than I originally thought.

Xfennec avatar Jan 21 '16 10:01 Xfennec

On my computer the order cp does things is deterministic - try figuring that part out. Of course, this won't work if the fs gets modified in the meantime.

Once you know the files' order you can check sizes.

cheater avatar Jan 21 '16 12:01 cheater

File order for cp -R and other recursive forms of coreutils commands is built with readdir() (coreutils source code, file savedir.c, function savedirstream()), and this function doesn't guarantee any ordering. I can't really build a tool on empirical observation. We would also need to duplicate a massive amount of code from coreutils to deal with recursive scans, because it's a tricky subject for commands like cp and mv.

It looks like a big, non-portable, error-prone, ugly hack to me. Looking at it, even the non-recursive form is full of traps.

Xfennec avatar Jan 21 '16 13:01 Xfennec

It took me a short while to figure out, but the git version of coreutils doesn't contain savedir.c, because that got split off into gnulib, which you can use. It is the "gnu portability library", so it's exactly what we want. I am not sure if you can use savedirstream() as is or if you need to copy liberally from the body of that function, but all it does is bookkeeping in a loop written around readdir(), which is where the portability actually happens.

cheater avatar Jan 21 '16 14:01 cheater

@Xfennec, now that I understand the difficulties involved in this, I'm also thinking it's just a bad idea. Maybe you could only show total progress when multiple files are specified explicitly via the command-line? E.g. cp 1 2 3 *.whatever target/dir/. Maybe that's a good compromise, since this should actually be easy to implement.

guiprav avatar Jan 21 '16 14:01 guiprav

@n2liquid it seems like gnulib provides a lot of the functionality needed for -R support.

cheater avatar Jan 21 '16 14:01 cheater

@cheater The main issue here is the lack of ordering of readdir(). We have no guarantee the file list order will be the same as the one cp (for instance) is building along its recursive copy.

The other issue is the complexity of the scan: cross-device copies, symbolic links, hard links, duplicate names on the CLI, and so on. This code is not part of gnulib (copy.c is a good sample of this complexity), and that's where I see potential portability trouble (but my main concern is having to deal with such a huge amount of complex code).

Xfennec avatar Jan 21 '16 14:01 Xfennec

Why does the order matter, though? All you need to know to compute the total progress is the list of file sizes and how many bytes have been copied (which is already implemented).

guiprav avatar Jan 21 '16 14:01 guiprav

What is already implemented only relates to the currently opened file. To get "how many bytes have been copied since the beginning", we need to know which file is the current one and where it sits in the total (ordered) list of files, so we can deduce the previous files (and sum their sizes).
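To make the dependency on ordering concrete, here is a small sketch of that calculation in Python. The function name `total_progress` and its parameters are hypothetical; it assumes the file sizes are already known and listed in the order the command processes them.

```python
def total_progress(ordered_sizes, current_index, current_offset):
    """Fraction of all bytes copied so far.

    ordered_sizes:  file sizes in the order the command processes them
    current_index:  position of the currently opened file in that list
    current_offset: bytes already copied from the current file
    """
    total = sum(ordered_sizes)
    if total == 0:
        return 1.0  # nothing to copy counts as done
    # Every file before the current one is assumed to be fully copied.
    done = sum(ordered_sizes[:current_index]) + current_offset
    return done / total


# Same three files, same current file (300 bytes, 150 copied), two assumed orders:
print(total_progress([100, 300, 600], 1, 150))  # 0.25
print(total_progress([600, 100, 300], 2, 150))  # 0.85
```

The two calls describe the same moment of the same copy. Only the assumed order differs, and the reported total moves from 25% to 85%, which is why a wrong guess about the order is more than a small error.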

Xfennec avatar Jan 21 '16 14:01 Xfennec

@Xfennec there is no order guarantee, but this code doesn't go out of its way to provide random order either. It'll be the same over two calls, unless something's changed. I don't think people using progress will mind if it's off by a bit.

cheater avatar Jan 21 '16 15:01 cheater

Basically, the order is provided by the underlying filesystem(s). Some of them are quite complex, and I'm pretty sure that a big re-indexing can happen in apparently basic use cases. If the order of the root directory of the copy changes, we're no longer talking about small errors.

The role of progress is to show progress. And here, I have no guarantee that it can do its job correctly. It's hard to accept. (But tempting, I'm with you on that :)

Xfennec avatar Jan 21 '16 15:01 Xfennec

Just throwing it out there: Can't we maybe use mtime and/or atime to help at least detect that a file system re-indexing has started since the start of progress and then revert back to the standard behavior?

In any case, this isn't becoming just a small special case for cp / mv, it's becoming a project of its own (like 100+ sloc). Maybe if someone implements that separately from progress, progress could just use that if it's available in the system? What do you think?

guiprav avatar Jan 21 '16 15:01 guiprav

At our level, we have no idea what the FS is doing. A change at the other end of the disk, by another user, unrelated to our copy/move/whatever -may- change the order of files. There's no guarantee it can't happen.

Xfennec avatar Jan 21 '16 15:01 Xfennec

I am very unmoved by this last argument :)

cheater avatar Jan 21 '16 16:01 cheater

Also, you could keep tabs on the target using inotify and use that to compute how much has been done. You only need to grab the file list once. If files were added or deleted between cp and you grabbing the list of files you will be inaccurate. But we're giving it our best effort.

cheater avatar Jan 21 '16 16:01 cheater

@cheater, why don't we hack this progress extension together as some sort of separate tool it can leverage like I mentioned?

guiprav avatar Jan 21 '16 16:01 guiprav

No time here

cheater avatar Jan 21 '16 16:01 cheater

Me neither tbh, but maybe it wouldn't be too time consuming. I might give it a try... Maybe.

guiprav avatar Jan 21 '16 16:01 guiprav

Go for it! Would love to see what you can come up with - I bet there are lots of interesting corner cases

cheater avatar Jan 21 '16 16:01 cheater

I have no intention of covering all corner cases, though. I wish there weren't so many! If I do this, I'll implement what's useful for me and if people need support for say weird NFS mounts and other things, they can send PRs :-)

Since this would be installed on the side and wouldn't be part of progress, it wouldn't need to aim at the same level of quality and support for corner cases as progress. People install it if it's useful to them, or they simply don't, and hopefully everyone would be happy. Thoughts, @Xfennec? Feel free to be brutally honest.

guiprav avatar Jan 21 '16 16:01 guiprav

Feel free to experiment, of course. I will keep this subject on my list, too.

Xfennec avatar Jan 21 '16 16:01 Xfennec

What about just calculating the total size of the source and destination folders at the start and then using that as the percentage metric?
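A minimal sketch of that idea in Python, with hypothetical helper names `tree_size` and `size_based_progress`. It assumes the copy only ever adds bytes to the destination, so it is only meaningful when nothing that already exists there gets overwritten.

```python
import os


def tree_size(root):
    """Total size in bytes of the regular files below root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def size_based_progress(src_size_at_start, dst_size_at_start, dst_root):
    """Estimate progress as destination growth divided by the source size.

    Both *_at_start values are measured once, when monitoring begins.
    """
    if src_size_at_start == 0:
        return 1.0
    copied = tree_size(dst_root) - dst_size_at_start
    # Clamp: overwritten or deleted files can push the raw ratio out of range.
    return max(0.0, min(1.0, copied / src_size_at_start))
```

Rescanning the destination on every refresh can be slow on large trees, so a real tool would probably want to cache or rate-limit the walk.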

Ericmas001 avatar Jan 22 '16 13:01 Ericmas001

And what if the destination folder already existed and is being overwritten by the copy?

Xfennec avatar Jan 22 '16 14:01 Xfennec

You're right, this wasn't my case, so I didn't think of that!

Ericmas001 avatar Jan 23 '16 03:01 Ericmas001