HDDS-10845. BaseFreonGenerator allows an empty prefix

Open whbing opened this issue 1 year ago • 5 comments

What changes were proposed in this pull request?

BaseFreonGenerator allows an empty prefix instead of enforcing a random prefix when no prefix is specified.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-10845

How was this patch tested?

manual tests.

case 1:

without pr:

# note: write first, then execute the related read op
$ ozone freon ockrw -r 1000 -t 10 --linear --contiguous --percentage-read 50 -m=true --size=0 --duration 10s -v vol1 -b freon2
2024-05-12 16:54:45,789 [pool-2-thread-9] ERROR freon.OzoneClientKeyReadWriteListOps: Key:o7h00eovbs/9 not found
2024-05-12 16:54:45,789 [pool-2-thread-2] ERROR freon.OzoneClientKeyReadWriteListOps: Key:o7h00eovbs/5 not found
2024-05-12 16:54:45,789 [pool-2-thread-1] ERROR freon.OzoneClientKeyReadWriteListOps: Key:o7h00eovbs/7 not found
2024-05-12 16:54:45,789 [pool-2-thread-5] ERROR freon.OzoneClientKeyReadWriteListOps: Key:o7h00eovbs/0 not found
2024-05-12 16:54:45,791 [pool-2-thread-6] ERROR freon.BaseFreonGenerator: Error on executing task 5
java.lang.RuntimeException: Key:o7h00eovbs/6 not found
    at org.apache.hadoop.ozone.freon.OzoneClientKeyReadWriteListOps.lambda$readWriteListKeys$0(OzoneClientKeyReadWriteListOps.java:212)
    at com.codahale.metrics.Timer.time(Timer.java:116)
    at org.apache.hadoop.ozone.freon.OzoneClientKeyReadWriteListOps.readWriteListKeys(OzoneClientKeyReadWriteListOps.java:192)
    at org.apache.hadoop.ozone.freon.BaseFreonGenerator.tryNextTask(BaseFreonGenerator.java:220)
    at org.apache.hadoop.ozone.freon.BaseFreonGenerator.taskLoop(BaseFreonGenerator.java:200)
    at org.apache.hadoop.ozone.freon.BaseFreonGenerator.lambda$startTaskRunners$0(BaseFreonGenerator.java:174)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748) 

with pr, no err:

$ ozone freon ockrw -r 1000 -t 10 --linear --contiguous --percentage-read 50 -m=true --size=0 --duration 10s -v vol1 -b freon2
2024-05-12 17:42:51,677 [main] INFO freon.BaseFreonGenerator: Executing test with prefix '' and number-of-tests 1000

 100.00% |█████████████████████████████████████████████████████████████████████████████████████████████████████|  10/10 Time: 0:00:10|
24-5-12 17:43:02 ===============================================================

-- Timers ----------------------------------------------------------------------
key-read-write-list
             count = 6073
         mean rate = 671.91 calls/second
     1-minute rate = 630.00 calls/second
     5-minute rate = 630.00 calls/second
    15-minute rate = 630.00 calls/second
               min = 0.29 milliseconds
               max = 223.48 milliseconds
              mean = 12.61 milliseconds
            stddev = 30.34 milliseconds
            median = 1.82 milliseconds
              75% <= 2.95 milliseconds
              95% <= 90.20 milliseconds
              98% <= 93.20 milliseconds
              99% <= 94.80 milliseconds
            99.9% <= 223.33 milliseconds


Total execution time (sec): 10
Failures: 0
Successful executions: 6073

case 2:

without pr:

$ ozone freon ommg --operation CREATE_KEY -n 25000 --duration 10 -v vol1 -b freon2
2024-05-12 15:22:35,976 [main] INFO freon.BaseFreonGenerator: Executing test with prefix hrjrsaohi9 and number-of-tests 25000

"prefix hrjrsaohi9" should be prefix ''.

with pr:

2024-05-12 17:45:36,898 [main] INFO freon.BaseFreonGenerator: Executing test with prefix '' and number-of-tests 25000

whbing avatar May 12 '24 09:05 whbing

@adoroszlai @xichen01 Could you help review if you have time? Thanks!

whbing avatar May 12 '24 14:05 whbing

@whbing Thanks for your PR.

I think we can add a --noprefix option to BaseFreonGenerator (default value is false) and set a different default value in some subclass if it is necessary.
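
One way to picture this suggestion is the minimal sketch below. It is not the actual Freon code: the class names `PrefixedGenerator`, `RandomPrefixGenerator`, and `EmptyPrefixGenerator` and the method `allowEmptyPrefix()` are hypothetical, and the 10-character random fallback is an assumption based only on the length of the prefixes in the logs above. The base class keeps the random default, and a subclass opts in to an empty prefix by overriding one method.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical stand-in for a generator base class; NOT the real BaseFreonGenerator.
abstract class PrefixedGenerator {
  private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";
  private final String requestedPrefix;

  PrefixedGenerator(String requestedPrefix) {
    this.requestedPrefix = requestedPrefix == null ? "" : requestedPrefix;
  }

  // Default mirrors the suggested "false"; subclasses may override to opt in.
  protected boolean allowEmptyPrefix() {
    return false;
  }

  // An explicit prefix always wins; otherwise fall back to "" or a random value.
  final String resolvePrefix() {
    if (!requestedPrefix.isEmpty()) {
      return requestedPrefix;
    }
    return allowEmptyPrefix() ? "" : randomPrefix(10);
  }

  // Assumed fallback: 10 random lowercase alphanumeric characters.
  private static String randomPrefix(int length) {
    StringBuilder sb = new StringBuilder(length);
    for (int i = 0; i < length; i++) {
      int pos = ThreadLocalRandom.current().nextInt(ALPHABET.length());
      sb.append(ALPHABET.charAt(pos));
    }
    return sb.toString();
  }
}

// Hypothetical subclass that keeps the random default.
class RandomPrefixGenerator extends PrefixedGenerator {
  RandomPrefixGenerator(String prefix) {
    super(prefix);
  }
}

// Hypothetical subclass that opts in to an empty prefix.
class EmptyPrefixGenerator extends PrefixedGenerator {
  EmptyPrefixGenerator(String prefix) {
    super(prefix);
  }

  @Override
  protected boolean allowEmptyPrefix() {
    return true;
  }
}
```

With this shape, a command-line flag such as the proposed --noprefix would only need to feed the value returned by `allowEmptyPrefix()`; how that flag would be wired up is left open here.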

xichen01 avatar May 13 '24 05:05 xichen01

Thanks @whbing for working on this. Unfortunately, I don't think using an empty prefix will solve this.

  • Read-only workload will never work without prior writes. (There are standalone read-only Freon subcommands, e.g. ockv, dfsv, ocokr)
  • Mixed workload may still run into read errors depending on the progress of read/write threads. (I tried your command and did get Key:... not found errors.)

The result of using an empty prefix is the same as using a fixed non-empty prefix (e.g. -p test).

adoroszlai avatar May 13 '24 08:05 adoroszlai

@adoroszlai Thanks for the review.

The result of using an empty prefix is the same as using a fixed non-empty prefix (e.g. -p test).

The prefix is always non-empty due to the following code, regardless of whether the prefix is actually used.

https://github.com/apache/ozone/blob/06c7cb419c141a0525c3d91063517b9912c9ff70/hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/BaseFreonGenerator.java#L287-L292

  • It does not affect write ops, even with a random prefix.
  • For read ops, it always selects a random, incorrect prefix.

This seems to force all subcommands to use a prefix. I would prefer to allow empty prefixes (e.g. for freon ommg).

whbing avatar May 13 '24 13:05 whbing

@whbing Let me clarify my comment. I think allowing empty prefix is useful to solve problem 2 (random prefix is logged, but not used), but it will not fix problem 1 (reader cannot find data).

adoroszlai avatar May 13 '24 14:05 adoroszlai

but it will not fix problem 1 (reader cannot find data).

@adoroszlai It can fix problem 1 if an empty prefix is allowed when writing (before the read op), as in the following script:

# test 1 : without pr
ozone sh bucket create vol1/freon1
ozone freon ockrw -r 1000 -t 10 -m=true --linear --contiguous --duration 10s -v vol1 -b freon1
# mix w/r Failed : Key not found
ozone freon ockrw -r 100 -t 10 -m=true --linear --contiguous --percentage-read 50 --duration 10s -v vol1 -b freon1

# test 2 : with pr
ozone sh bucket create vol1/freon2
ozone freon ockrw -r 1000 -t 10 -m=true --linear --contiguous --duration 10s -v vol1 -b freon2
# mix w/r Successful
ozone freon ockrw -r 100 -t 10 -m=true --linear --contiguous --percentage-read 50 --duration 10s -v vol1 -b freon2

It isn't a big issue in itself. My rationale for the change is that an empty prefix could potentially offer better performance in an FSO bucket.

whbing avatar May 14 '24 15:05 whbing

It can fix problem 1 if empty prefix allowed when writing (before read op)

So the problem is worked around by executing write-only workload before read or read/write workload. That works if the subsequent workloads use the same prefix, empty or not. (BTW, in both cases it requires the write workload to create at least as many items as the read workload will try to read.)
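
The two conditions above (same prefix, and at least as many keys written as will be read) can be illustrated with a small in-memory sketch. This is not Freon code: `WorkloadSketch` and its methods are hypothetical, the bucket is just a `Set`, and the "prefix/index" key shape is taken from the error log in the description. How an empty prefix is rendered in the key name is an assumption.

```java
import java.util.Set;

// In-memory illustration only; a Set<String> stands in for the bucket.
class WorkloadSketch {
  // Key shape follows the log ("o7h00eovbs/9"); the empty-prefix form is assumed.
  static String keyName(String prefix, long index) {
    return prefix.isEmpty() ? Long.toString(index) : prefix + "/" + index;
  }

  // Simulates a write-only run that creates keys 0..count-1 under the prefix.
  static void write(Set<String> bucket, String prefix, long count) {
    for (long i = 0; i < count; i++) {
      bucket.add(keyName(prefix, i));
    }
  }

  // Counts how many of the keys 0..count-1 a read run would fail to find.
  static long missingOnRead(Set<String> bucket, String prefix, long count) {
    long missing = 0;
    for (long i = 0; i < count; i++) {
      if (!bucket.contains(keyName(prefix, i))) {
        missing++;
      }
    }
    return missing;
  }
}
```

In this model, writing 1000 keys and then reading 100 with the same prefix finds every key, reading with a different prefix misses all of them, and writing only 50 keys before reading 100 misses the last 50.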

I'm not opposed to this change, but stating it fixes problem 1 may be confusing. Users are still likely to run into the problem by not being aware of the need for the initial write-only workload.

adoroszlai avatar May 14 '24 15:05 adoroszlai

Users are still likely to run into the problem by not being aware of the need for the initial write-only workload.

There doesn't seem to be a good way to handle this. Perhaps it would be better to pre-check that the path (vol/bucket/prefix) exists before running a read workload, rather than endlessly logging a large number of "Key not found" errors.
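
A rough sketch of such a pre-check follows, under stated assumptions: `ReadPrecheck` and `requireKeysPresent` are hypothetical names, the `keyExists` predicate stands in for whatever lookup the real client would perform, and the "prefix/index" key shape is copied from the log in the description. The idea is to probe a few sample keys once and stop with a single clear message.

```java
import java.util.function.Predicate;

// Hypothetical fail-fast check; not part of Freon.
class ReadPrecheck {
  // Probes a few sample keys and throws once instead of logging every miss.
  static void requireKeysPresent(Predicate<String> keyExists, String prefix,
      long... sampleIndexes) {
    for (long index : sampleIndexes) {
      // Key shape assumed from the log output; the empty-prefix form is a guess.
      String key = prefix.isEmpty() ? Long.toString(index) : prefix + "/" + index;
      if (!keyExists.test(key)) {
        throw new IllegalStateException("Key " + key
            + " not found; run a write-only workload with the same prefix first");
      }
    }
  }
}
```

Probing the first and last index of the planned read range would catch both a wrong prefix and a write run that was too short.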

whbing avatar May 14 '24 16:05 whbing

Thanks @xichen01 for the review, and thanks @adoroszlai for the review and merge!

whbing avatar May 15 '24 02:05 whbing