kv/bulk: enable async flushing
This change re-enables async flushing in the SST batcher. Async flushing is a throughput win for large (1 Tib+) out of order writers. This occurs during index builds, out of order imports, and PCR.
Previously, async flushing was a special case in the batcher. This change reworks async flushing so that all flushes are async. This ensures that context handling and error handling are consistent. This should reduce the probability of an async only bug in the sst writer that is hard to exercise.
Fixes: https://github.com/cockroachdb/cockroach/issues/146828
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?
:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?
:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Thanks for the review!
bors r+
There in .Close() that was caught by a unit test. If Close was called while there were in-flight flushes, it would wait for the flushes to complete, but it leaked the flush span. This was fixed by calling syncFlush() in .Close() instead of waiting for the err group.
I also added a regression test (TestSSTBatcherCloseWithoutFlush) that tests what happens when .Close() is called with in flight flushes. When writing the test I discovered a subtle difference between test clusters and test servers. Test clusters include validation to ensure no spans are leaked. Test servers do not include this validation.
bors r+