HDDS-11405. Implement setrep command for Ozone FS.
What changes were proposed in this pull request?
Currently this command does not work, because Ozone does not support changing replication on demand. However, the new atomic rewriteKey API makes it possible to rewrite a key with a new replication config, and setrep can be implemented on top of it. This applies only to RATIS keys, not EC keys, since a replication factor has no meaning for EC keys.
What is the link to the Apache JIRA?
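To make the idea concrete, here is a minimal in-memory sketch of setrep built on a conditional rewrite. It is not the patch itself: SetrepSketch, Key, the generation counter, and the return values for missing or EC keys are all hypothetical stand-ins, and only the "not supported" message text is taken from the shell output in this description.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory model of setrep via rewrite; none of these types are Ozone classes. */
final class SetrepSketch {
  enum ReplicationType { RATIS, EC }

  /** Hypothetical stand-in for a stored key and its metadata. */
  static final class Key {
    final ReplicationType type;
    final int factor;
    final byte[] data;
    final long generation;

    Key(ReplicationType type, int factor, byte[] data, long generation) {
      this.type = type;
      this.factor = factor;
      this.data = data;
      this.generation = generation;
    }
  }

  private final Map<String, Key> keys = new HashMap<>();

  synchronized void put(String name, ReplicationType type, int factor, byte[] data) {
    keys.put(name, new Key(type, factor, data.clone(), 1L));
  }

  synchronized Key get(String name) {
    return keys.get(name);
  }

  /** Commits the replacement only if the key is still at the expected generation. */
  synchronized boolean rewrite(String name, long expectedGeneration, Key replacement) {
    Key latest = keys.get(name);
    if (latest == null || latest.generation != expectedGeneration) {
      return false; // key was deleted or overwritten since it was read
    }
    keys.put(name, replacement);
    return true;
  }

  /** Returns true if the key ends up with the requested factor. */
  boolean setReplication(String name, int newFactor) {
    if (newFactor != 1 && newFactor != 3) {
      throw new IllegalArgumentException("Replication factor of " + newFactor + " not supported");
    }
    Key current = get(name);
    if (current == null || current.type == ReplicationType.EC) {
      return false; // assumption: missing keys and EC keys are left untouched
    }
    if (current.factor == newFactor) {
      return true; // nothing to rewrite
    }
    // Same data, new factor; the commit is conditional on the generation that was read.
    Key replacement =
        new Key(ReplicationType.RATIS, newFactor, current.data, current.generation + 1);
    return rewrite(name, current.generation, replacement);
  }

  public static void main(String[] args) {
    SetrepSketch store = new SetrepSketch();
    store.put("key1", ReplicationType.RATIS, 1, new byte[] {1, 2, 3});
    System.out.println("changed: " + store.setReplication("key1", 3));
    System.out.println("factor: " + store.get("key1").factor);
  }
}
```

The conditional commit is what keeps a concurrent overwrite from being clobbered: if the key changed between the read and the rewrite, the rewrite is refused instead of replacing newer data.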
https://issues.apache.org/jira/browse/HDDS-11405
How was this patch tested?
Unit tests added. Also tested with the ozone fs shell:
bash-4.4$ ozone sh key info vol1/buck1/key1 | grep "replicationFactor"
"replicationFactor" : "ONE",
bash-4.4$ ozone fs -ls ofs://om/vol1/buck1/key1
-rw-rw-rw- 1 hadoop hadoop 4068 2024-09-05 08:36 ofs://om/vol1/buck1/key1
bash-4.4$ ozone fs -setrep -w 3 ofs://om/vol1/buck1/key1
Replication 3 set: ofs://om/vol1/buck1/key1
Waiting for ofs://om/vol1/buck1/key1 ... done
bash-4.4$ ozone sh key info vol1/buck1/key1 | grep "replicationFactor"
"replicationFactor" : "THREE",
bash-4.4$ ozone fs -setrep -w 2 ofs://om/vol1/buck1/key1
setrep: Replication factor of 2 not supported
setrep with and without -w
bash-4.4$ ozone fs -setrep 3 ofs://om/s3v/buck/key2
-setrep: Asynchronous set rep is not supported,Please use -w arg
Usage: ozone fs [generic options]
[-appendToFile [-n] <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum [-v] <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-concat <target path> <src path> <src path> ...]
[-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] [-q <thread pool queue size>] <localsrc> ... <dst>]
[-copyToLocal [-f] [-p] [-crc] [-ignoreCrc] [-t <thread count>] [-q <thread pool queue size>] <src> ... <localdst>]
[-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] [-e] [-s] <path> ...]
[-cp [-f] [-p | -p[topax]] [-d] [-t <thread count>] [-q <thread pool queue size>] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] [-v] [-x] <path> ...]
[-expunge [-immediate] [-fs <path>]]
[-find <path> ... <expression> ...]
[-get [-f] [-p] [-crc] [-ignoreCrc] [-t <thread count>] [-q <thread pool queue size>] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
[-head <file>]
[-help [cmd ...]]
[-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] [-d] [-t <thread count>] [-q <thread pool queue size>] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] [-s <sleep interval>] <file>]
[-test -[defswrz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touch [-a] [-m] [-t TIMESTAMP (yyyyMMdd:HHmmss) ] [-c] <path> ...]
[-touchz <path> ...]
[-truncate [-w] <length> <path> ...]
[-usage [cmd ...]]
Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines
The general command line syntax is:
command [genericOptions] [commandOptions]
Usage: ozone fs [generic options] -setrep [-R] [-w] <rep> <path> ...
bash-4.4$ ozone fs -setrep -w 3 ofs://om/s3v/buck/key2
Replication 3 set: ofs://om/s3v/buck/key2
Waiting for ofs://om/s3v/buck/key2 ... done
Does ozone fs -setrep work recursively on vol/bucket/dir at the moment? If not, do we have plans to support it later?
Recursive operation is handled in SetReplication (the implementation of -setrep), so I guess we get that for free. FileSystem.setReplication, which is implemented here for Ozone, only handles files.
On the other hand, it looks like -setrep is async by default (-w makes it wait for replication to complete), which this implementation does not support. I guess that's part of what @jojochuang referred to here:
requires client to read from source and write to the destination. I think that wouldn't be expected for a user coming from Hadoop land
On the other hand, it looks like -setrep is async by default
Yes, it is async in Hadoop, since the blocks are replicated or deleted according to the new replication factor (the client only sends the setrep request to the NN). Using rewrite is a hack here; if needed we could do it in a separate thread to make it asynchronous, but the client would still do the work of replicating, not the server as in HDFS.
If this is not desired, we could probably close this and leave the current behaviour as is.
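For illustration only, a sketch of what "do it in a separate thread" could look like. AsyncSetrepSketch and setReplicationAsync are hypothetical names, and the Supplier stands in for the client-side rewrite of one key; nothing here is part of the patch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Hypothetical sketch: run the client-side rewrite on a background thread. */
final class AsyncSetrepSketch {
  /** Hands the rewrite to the pool and returns immediately with a handle to the result. */
  static CompletableFuture<Boolean> setReplicationAsync(
      Supplier<Boolean> rewrite, ExecutorService pool) {
    return CompletableFuture.supplyAsync(rewrite, pool);
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      // The lambda is a placeholder for reading the key and writing it back.
      CompletableFuture<Boolean> pending = setReplicationAsync(() -> true, pool);
      System.out.println("request accepted");
      System.out.println("rewrite finished: " + pending.join());
    } finally {
      pool.shutdown();
    }
  }
}
```

Even in this form the data still flows through the client process, so the call only looks asynchronous to the caller; if the client exits before the future completes, the rewrite is lost.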
Using rewrite is a hack here; if needed we could do it in a separate thread to make it asynchronous, but the client would still do the work of replicating, not the server as in HDFS. If this is not desired, we could probably close this and leave the current behaviour as is.
Alternatively, we could override the shell command from Hadoop, rejecting invocation without -w as "not implemented". This would make behavior consistent (since -w is forced), and let us add async implementation in the future. (Let me know if more details are needed.)
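A rough sketch of the check I have in mind, written as a standalone parser rather than against Hadoop's shell classes. SetrepArgsSketch and parse are hypothetical names, and the error wording is only an example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of -setrep argument handling that insists on -w. */
final class SetrepArgsSketch {
  final boolean recursive;
  final int replication;
  final List<String> paths;

  private SetrepArgsSketch(boolean recursive, int replication, List<String> paths) {
    this.recursive = recursive;
    this.replication = replication;
    this.paths = paths;
  }

  /** Parses "[-R] [-w] <rep> <path> ..." and rejects invocations without -w. */
  static SetrepArgsSketch parse(List<String> args) {
    boolean recursive = false;
    boolean wait = false;
    int i = 0;
    // Consume leading option flags in any order.
    while (i < args.size() && (args.get(i).equals("-R") || args.get(i).equals("-w"))) {
      if (args.get(i).equals("-R")) {
        recursive = true;
      } else {
        wait = true;
      }
      i++;
    }
    if (!wait) {
      throw new IllegalArgumentException("Asynchronous setrep is not supported, please use -w");
    }
    if (args.size() - i < 2) {
      throw new IllegalArgumentException("Usage: -setrep [-R] [-w] <rep> <path> ...");
    }
    int replication = Integer.parseInt(args.get(i));
    List<String> paths = new ArrayList<>(args.subList(i + 1, args.size()));
    return new SetrepArgsSketch(recursive, replication, paths);
  }

  public static void main(String[] args) {
    SetrepArgsSketch parsed = parse(Arrays.asList("-w", "3", "ofs://om/vol1/buck1/key1"));
    System.out.println(parsed.replication + " " + parsed.paths);
  }
}
```

Failing fast at parse time keeps the shell behaviour consistent, and the check can simply be dropped once an async implementation exists.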
We need more robot tests. Please include every positive and negative test case for robot cli tests.
we could override the shell command from Hadoop, rejecting invocation without -w as "not implemented". This would make behavior consistent (since -w is forced), and let us add async implementation in the future.
Thanks @adoroszlai for the comment, I have made this change.
However, this is at the command level, i.e. it only makes sense if the user is using the ozone fs shell (it also won't work for the hadoop fs shell, as we are overriding the command in Ozone). At the API level the behaviour is still the same, i.e. non-async. Hadoop use cases generally involve the Hadoop FS APIs rather than the FS shell; if the behaviour is okay there, we could go ahead with this patch.
Hadoop use cases generally involve the Hadoop FS APIs rather than the FS shell
Thanks, I didn't know that.
For example, the mapreduce.robot file that this patch touches calls this code when executed, which in turn calls fs.setReplication().
Thanks again @sadanand48 for the patch. Given that:
- the same functionality is available via ozone sh CLI, which even supports rewriting EC keys
- usage via FileSystem API has different behavior compared to Hadoop's async implementation
I suggest abandoning this.
Sure, closing this.