
Binary request generator/replayer

Open byrnedj opened this issue 1 year ago • 6 comments

This is the binary trace replayer/generator that we have been using to achieve max CPU utilization for the kvcache traces in cachebench. With this generator, we can achieve a throughput of over 20 million op/sec using the kvcache workload in cachebench. For comparison, the CSV replay generator reaches only ~1.6 million op/sec because of dynamic allocations and parsing overhead.

We avoid allocations by mmap'ing the request data into memory and using a Request pointer to point to the request data rather than allocating a new request wrapper for each request.
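The idea can be sketched as follows. This is a minimal illustration, not the code in this patch: the `BinaryRequest` layout, the `writeTrace` helper, and the `MappedTrace` class are hypothetical names made up for the example, and it assumes a POSIX system. The file is mapped once, and each request is handed out as a pointer into the mapping, so the replay loop performs no per-request allocation or parsing.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical fixed-size record; the patch's real on-disk layout may differ.
struct BinaryRequest {
  uint64_t keyHash;    // stand-in for the key
  uint32_t valueSize;  // object size in bytes
  uint8_t op;          // illustrative encoding: 0 = get, 1 = set
  uint8_t pad[3];      // explicit padding keeps the record at 16 bytes
};
static_assert(sizeof(BinaryRequest) == 16, "unexpected record size");

// Writes records back to back, as a one-time conversion step would.
inline void writeTrace(const std::string& path,
                       const std::vector<BinaryRequest>& reqs) {
  std::FILE* f = std::fopen(path.c_str(), "wb");
  if (!f) throw std::runtime_error("cannot open " + path);
  size_t n = std::fwrite(reqs.data(), sizeof(BinaryRequest), reqs.size(), f);
  std::fclose(f);
  if (n != reqs.size()) throw std::runtime_error("short write to " + path);
}

// Read-only view over the mapped file. Requests are never copied.
class MappedTrace {
 public:
  explicit MappedTrace(const std::string& path) {
    int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) throw std::runtime_error("cannot open " + path);
    struct stat st;
    if (::fstat(fd, &st) != 0 ||
        static_cast<size_t>(st.st_size) < sizeof(BinaryRequest)) {
      ::close(fd);
      throw std::runtime_error("empty or unreadable trace: " + path);
    }
    len_ = static_cast<size_t>(st.st_size);
    void* p = ::mmap(nullptr, len_, PROT_READ, MAP_PRIVATE, fd, 0);
    ::close(fd);  // the mapping remains valid after the descriptor is closed
    if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
    base_ = static_cast<const BinaryRequest*>(p);
    count_ = len_ / sizeof(BinaryRequest);
  }
  ~MappedTrace() { ::munmap(const_cast<BinaryRequest*>(base_), len_); }
  MappedTrace(const MappedTrace&) = delete;
  MappedTrace& operator=(const MappedTrace&) = delete;

  size_t size() const { return count_; }

  // Returns a pointer into the mapping. The index wraps around so that a
  // replay loop can run for more operations than the trace contains.
  const BinaryRequest* at(size_t i) const { return base_ + (i % count_); }

 private:
  const BinaryRequest* base_ = nullptr;
  size_t len_ = 0;
  size_t count_ = 0;
};
```

A replay loop would then call `trace.at(i)` for increasing `i` and pass the returned pointer straight to the cache operation. Fixed-size records are what make the index arithmetic this cheap; a variable-length format would need an offset table or sequential scanning.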

To generate a binary request file from an existing kvcache trace (using the "replay" generator):

  1. Specify the kvcache trace name using the regular traceFileNames or traceFileName option. Specify other properties such as ampFactor too.
  2. In the replayGeneratorConfig, specify binaryFileName: "mybinaryfile.bin" as a config option
  3. Run cachebench and wait for the binary file to be generated
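
A config fragment for the conversion steps above might look roughly like this. Only the `replay` generator, `traceFileName`, `ampFactor`, `replayGeneratorConfig`, and `binaryFileName` come from this description. The `test_config` wrapper, the placement of `ampFactor` inside `replayGeneratorConfig`, the input file name `kvcache_trace.csv`, and the value `1` are assumptions for illustration. The `cache_config` section and the other test options are omitted.

```json
{
  "test_config": {
    "generator": "replay",
    "traceFileName": "kvcache_trace.csv",
    "replayGeneratorConfig": {
      "ampFactor": 1,
      "binaryFileName": "mybinaryfile.bin"
    }
  }
}
```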

To run a binary request trace specify the following:

  1. Set generator to "binary-replay"
  2. Set traceFileName: "mybinaryfile.bin" and set ampSizeFactor (if desired)
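
A matching fragment for replaying the binary file could look like the following. The `binary-replay` generator, `traceFileName`, and `ampSizeFactor` are named above. The `test_config` wrapper, the placement of `ampSizeFactor` inside `replayGeneratorConfig`, and the value `1` are assumptions, and the rest of the config is omitted.

```json
{
  "test_config": {
    "generator": "binary-replay",
    "traceFileName": "mybinaryfile.bin",
    "replayGeneratorConfig": {
      "ampSizeFactor": 1
    }
  }
}
```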

In summary, this patch offers much lower trace-replay overhead. It does assume the kvcache trace format and kvcache replay generator behavior. Additional features:

  • fast forwarding of a trace
  • preloading requests into memory
  • object size amplification
  • queue free for even lower request overhead

The limitations are:

  • no trace amplification (however you can amplify the original .csv trace and save it in binary format)
  • ~4GB overhead per 100 million requests (roughly 40 bytes per request)
  • you need some disk space to store large traces

byrnedj avatar Apr 22 '24 21:04 byrnedj

@therealgymmy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Jun 17 '24 23:06 facebook-github-bot

@byrnedj: tested it out internally and verified a 10x throughput improvement. The binary trace currently does not repeat if the specified number of operations exceeds the trace length. Is this intended?

therealgymmy avatar Jul 19 '24 19:07 therealgymmy

@byrnedj has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot avatar Jul 22 '24 19:07 facebook-github-bot

I just added that functionality to the latest version.

byrnedj avatar Jul 22 '24 19:07 byrnedj

Thanks, let me re-import.

therealgymmy avatar Jul 23 '24 15:07 therealgymmy

@therealgymmy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Jul 23 '24 15:07 facebook-github-bot

@therealgymmy merged this pull request in facebook/CacheLib@253107481b6cff7e5d70fc54fce075bca2c463dc.

facebook-github-bot avatar Aug 29 '24 19:08 facebook-github-bot