
dcp: copy lustre striping params, even if -p not thrown

Open adammoody opened this issue 8 years ago • 12 comments

Currently we only copy Lustre stripe params if the user adds -p, but it's likely they want this even if they don't want to update timestamps and permissions. Enable copying of Lustre striping by default.

adammoody avatar Apr 02 '17 19:04 adammoody

Hmm, we are not very fond of this idea. Our users are using dcp between different Lustre filesystems, with different versions or different features, and we don't always want Lustre striping params (lustre.lov) to be copied over. A real-world example: we have a scratch filesystem (1) with both DoM and PFL enabled, and a longer-term Lustre filesystem (2) with traditional striping (without DoM or PFL). When copying from (1) to (2), dcp -p generates a LOT of warnings (EINVAL, errno 22).

The best for us would be an option separate from -p to enable/disable the copy of the Lustre striping info.

thiell avatar Jul 26 '19 23:07 thiell

... also when copying from (2) to (1) we don't want to copy the traditional striping info over, indeed we much prefer that DoM and PFL do their job on filesystem (1). I hope this makes sense. If at all possible, we would actually prefer to disable copying of all striping info with dcp, but still allow users to use -p to preserve permissions and ACLs. Thanks!

thiell avatar Jul 26 '19 23:07 thiell

Thanks for your suggestions and use cases @thiell . We should probably explode this ticket into 3 parts:

  1. We can look into splitting -p into sub-options so that one can select whether to copy timestamps, groups, permissions, and xattrs independently.

  2. As Lustre piggybacks its striping info in xattrs, it would also be nice to have a way to copy xattrs but optionally exclude Lustre attributes.

  3. It would be useful to be able to specify Lustre striping parameters via options, like we do in dstripe.

@adilger , thought I'd ping you for your input, too. If we add options to specify Lustre striping parameters, perhaps we should consider how we'd want to construct a syntax for progressive striping.

adammoody avatar Jul 29 '19 18:07 adammoody

Stephane, it is understandable that you would want the target layout to be based on the capabilities of the destination filesystem. In the case of DoM+PFL source to non-PFL destination, the layout parameters should be based on the "plain" layout parameters (stripe_size, stripe_count) of the last component of the file. These are essentially the "best" layout parameters for the particular file, given its current size.
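To make the "last component" rule above concrete, here is a minimal Python sketch. The `Component` record and the `plain_layout` helper are hypothetical stand-ins invented for illustration; they are not a Lustre API. Whether "last" should mean the last component in the layout or the last one instantiated for the file's current size is left open here: the sketch simply takes the final entry of whatever list it is given.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Component:
    """Hypothetical model of one composite-layout component (not a Lustre type)."""
    extent_end: Optional[int]  # end offset in bytes, None meaning "to end of file"
    stripe_size: int           # bytes
    stripe_count: int          # 0 is used here to stand for a DoM component


def plain_layout(components: List[Component]) -> Tuple[int, int]:
    """Return (stripe_size, stripe_count) of the last component in the list."""
    if not components:
        raise ValueError("file has no layout components")
    last = components[-1]
    return last.stripe_size, last.stripe_count
```

A caller would pass the components in layout order and use the returned pair as the plain striping for the destination file.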

What is interesting is that on the reverse operation (copy from non-PFL/DoM source to PFL+DoM target), I would argue that the same is true, that you would want to keep the "plain" layout parameters for the file rather than use the full PFL layout. The reason is that PFL layouts are a compromise based on the lack of information about size and usage at the time the file is first created, typically picking DoM and/or a small stripe_count for small files to minimize overhead, and a large stripe_count (and maybe a larger stripe_size) for large files to maximize throughput and balance space usage.

If we know the file size at the time it is being migrated, then why would we add the extra overhead of a few stripes + DoM for a few MiB at the start of the file, when the file is already known to be large and should be widely striped? My thought has always been that in the case of data migration (either between filesystems or within them) it makes sense to use a plain layout, and drop the DoM component of the file, for the target file. This should result in slightly better performance (fewer objects for the target file), and also reduce space usage on the MDT for the first MiB(s) that do not really improve performance on a multi-GiB file.

adilger avatar Jul 29 '19 21:07 adilger

This makes sense @adilger. I agree with you for files that have properly been striped in the first place (on the source filesystem). Those are usually files that have been properly striped by the user or a Lustre-aware application. But in my experience, they are not the majority. What if we were able to distinguish between default striping and custom striping? In that case, it would make sense to preserve the custom striping, assuming it's already the best one. But for files with the filesystem default striping, in most cases you would not want it to be preserved, because why would it be better than the target layout? I'm not sure that's easily doable as of today, though.

thiell avatar Jul 29 '19 23:07 thiell

Ah, we totally forgot about /etc/xattr.conf and just remembered this [1]. It's useful and works well with mv. I'm not sure in which cases exactly (and officially) this file should be used, but perhaps a simple solution would be for mpifileutils to check this xattr.conf config file too?

Example of error from fir (2.12 with DoM/PFL) to oak (2.10 with plain layout)

$ cat /etc/xattr.conf 
lustre.lov  skip

$ dcp -p sh-06-34 $OAK/sthiell/
[2019-08-02T16:46:40] [0] [dcp.c:1218] Preserving file attributes.
...
[2019-08-02T16:46:40] [0] [common.c:250] Failed to set value for name=lustre.lov on /oak/stanford/groups/ruthm/sthiell/sh-06-34 llistxattr() errno=22 Invalid argument
...
[2019-08-02T16:46:40] [0] [common.c:250] Failed to set value for name=lustre.lov on /oak/stanford/groups/ruthm/sthiell/sh-06-34/linux-4.20.5.tar.xz llistxattr() errno=22 Invalid argument
...

[1] https://groups.google.com/d/msg/lustre-discuss-list/WWKPrgWlKJQ/IIirSrVv67cJ
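For illustration only, here is a rough Python sketch of how a copy tool might consult such a file. It assumes one "pattern action" pair per line, as in the `lustre.lov  skip` example above; the `#` comment handling and the glob-style pattern matching are assumptions, and `parse_xattr_conf` and `should_skip` are hypothetical helper names, not mpifileutils functions.

```python
import fnmatch
from typing import Dict, Iterable


def parse_xattr_conf(lines: Iterable[str]) -> Dict[str, str]:
    """Map each pattern to its action, ignoring blank lines and '#' comments."""
    rules: Dict[str, str] = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2:
            continue  # ignore malformed lines rather than guess their meaning
        pattern, action = fields
        rules[pattern] = action
    return rules


def should_skip(name: str, rules: Dict[str, str]) -> bool:
    """True if some rule with action 'skip' matches the xattr name."""
    return any(
        action == "skip" and fnmatch.fnmatchcase(name, pattern)
        for pattern, action in rules.items()
    )
```

With the one-line file shown above, `should_skip("lustre.lov", rules)` would be true and the attribute would never be passed to setxattr(), so no warning would be produced for it.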

thiell avatar Aug 02 '19 23:08 thiell

I think in the case of setxattr("lustre.lov") there are a couple of things to be aware of:

  • it should either be set with mknod() + setxattr() before the file is opened
  • I thought that we didn't return an error in the case of setxattr() when the file already has a layout, but in any case this error can be ignored if copying files across filesystems, it will use the default layout
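The ordering in the first point can be sketched in Python as follows (Linux only, since `os.getxattr` and `os.setxattr` are Linux-specific). This is an assumption-laden illustration, not how dcp is implemented: `copy_with_layout` is a hypothetical name, and on a non-Lustre filesystem the xattr calls simply fail and the file keeps the default layout.

```python
import os
import shutil
import stat


def copy_with_layout(src: str, dst: str, xattr_name: str = "lustre.lov") -> bool:
    """Copy src to dst, trying to set the layout xattr before dst is opened.

    Returns True if the xattr was applied, False if it was skipped.
    """
    # Create the destination inode without opening it.
    os.mknod(dst, stat.S_IFREG | 0o600)

    applied = False
    try:
        value = os.getxattr(src, xattr_name)
        os.setxattr(dst, xattr_name, value)
        applied = True
    except OSError:
        # Not fatal: the destination falls back to the default layout.
        pass

    # Only now open the destination and copy the data.
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return applied
```

The return value lets a caller decide whether to report anything; the copy itself proceeds either way.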

Cheers, Andreas


adilger avatar Aug 03 '19 00:08 adilger

I don't think dcp returns an error at the end, but since we recommended dcp to our users for moving large quantities of data from Lustre to Lustre, several of them have recently raised concerns about these warning messages. Users likely don't know what they mean, and they worry about whether the copy succeeded. We would like to avoid such user-facing warning messages; that's basically our main problem with dcp right now. We also had tickets about mv from Lustre to Lustre, but as I mentioned, we were able to disable those warnings using the xattr.conf trick. Hope this makes sense.

thiell avatar Aug 03 '19 18:08 thiell

By "this error can be ignored" I mean that dcp should not print any error message if setxattr("lustre.lov") or setxattr("trusted.lov") returns an error when copying a file, since there isn't really anything that the user can do about it.
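A minimal Python sketch of that behavior, under stated assumptions: `set_xattr_quietly` and `QUIET_XATTRS` are hypothetical names, the set holds only the two attributes named above, and the `setter` parameter exists purely so the function can be exercised without a Lustre mount (it defaults to the Linux-only `os.setxattr`).

```python
import os
import sys
from typing import Callable, Optional, TextIO

# Attribute names taken from the comment above; illustrative, not exhaustive.
QUIET_XATTRS = frozenset({"lustre.lov", "trusted.lov"})


def set_xattr_quietly(
    path: str,
    name: str,
    value: bytes,
    setter: Optional[Callable[[str, str, bytes], None]] = None,
    log: TextIO = sys.stderr,
) -> bool:
    """Set an xattr; report failures except for names in QUIET_XATTRS.

    Returns True on success and False on failure, whether or not it was logged.
    """
    if setter is None:
        setter = os.setxattr  # Linux-only default
    try:
        setter(path, name, value)
        return True
    except OSError as exc:
        if name not in QUIET_XATTRS:
            print(
                f"Failed to set value for name={name} on {path} "
                f"errno={exc.errno} {exc.strerror}",
                file=log,
            )
        return False
```

The function still returns False for a suppressed failure, so a caller can count such cases even though nothing is printed.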

adilger avatar Aug 07 '19 17:08 adilger

We could filter error messages to avoid printing anything for failures on lustre.lov and trusted.lov.

Are there any other Lustre related attributes to watch out for?

Are there any valid error conditions for these attributes that we should still be careful to print?

I think it also makes sense to look at adding support for processing entries in /etc/xattr.conf. That file is new to me. Is there a good reference for documentation on its syntax?

adammoody avatar Aug 12 '19 21:08 adammoody

On Aug 12, 2019, at 3:39 PM, Adam Moody wrote:

> We could filter error messages to avoid printing anything for failures on lustre.lov and trusted.lov.
>
> Are there any other Lustre related attributes to watch out for?

There are several xattrs that are visible to root that are not valid to copy: trusted.lma, trusted.som, trusted.lmv, trusted.hsm, etc.

Normally Lustre will just quietly eat them if set.

> Are there any valid error conditions for these attributes that we should still be careful to print?

For copy I don't think so. If the xattr copy is not done before the file is opened, then there isn't really much that can be done by setting the xattr.

> I think it also makes sense to look at adding support for processing entries in /etc/xattr.conf. That file is new to me. Is there a good reference for documentation on its syntax?

The xattr.conf file used to be documented I thought, but I couldn't find anything out about it from the current man pages.

Cheers, Andreas

adilger avatar Aug 15 '19 00:08 adilger

My new PR #503 almost addresses this, I think, but currently, in dcp, --copy-xattrs is only allowed in combination with --preserve. It sounds like I should make them independent. I'll take a look at that.

One idea mentioned above was suppressing error messages when setxattr("lustre.lov") or setxattr("trusted.lov") returns an error. I think that is not appropriate if they are only copied due to the user specifying --copy-xattrs=all or --copy-xattrs=libattr. Do you agree?

My thinking is that dcp and dsync should behave the same way WRT copying xattrs - that the defaults should be the same, and the options to change those defaults should be the same. Does anyone here think otherwise?

ofaaland avatar Oct 14 '21 20:10 ofaaland