Parallelize compression
Do block compression on a thread pool, writing files as they complete.
I'm assuming here that compression is the expensive operation we want to parallelize, but depending on the observed balance of time, hashing could perhaps be parallelized too.
Since the index is sorted, we need to accept completed blocks in order of the filename that includes them. So this is complementary to, and related to, storing content from multiple files in a single block. Although we know the hash in advance, we shouldn't write the index entry until the block data has been written.
Because blocks are of limited size, we can keep them entirely in memory and simply transfer ownership of each block to the worker thread that compresses it.
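Transferring ownership of the block bytes through a channel means the worker needs no locking or copying to use them. A minimal sketch, assuming a hypothetical `compress_block` standing in for the real compressor:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for the real compressor (e.g. Snappy).
// Identity here; a real implementation would return compressed bytes.
fn compress_block(data: Vec<u8>) -> Vec<u8> {
    data
}

fn main() {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    // Ownership of each block's Vec<u8> moves into the channel, so the
    // worker thread owns it outright while compressing.
    let worker = thread::spawn(move || {
        let mut blocks_done = 0;
        for block in rx {
            let _compressed = compress_block(block);
            blocks_done += 1;
        }
        blocks_done
    });
    for _ in 0..3 {
        tx.send(vec![0u8; 1024]).unwrap();
    }
    drop(tx); // closing the channel lets the worker's loop finish
    assert_eq!(worker.join().unwrap(), 3);
}
```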
Run ~n_cpus compression worker threads.
One main thread reads files, hashes them, and accumulates data into blocks. It pushes each block onto a queue for compression, along with a channel through which the worker can signal that the block is complete.
Each compression worker thread returns a `Result<()>`, so errors propagate back to the main thread when it is joined.
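The worker pool could be sketched as below; `run_workers`, the `String` error type, and the shared `Arc<Mutex<Receiver>>` queue are all illustrative choices, not the actual implementation:

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

type Block = Vec<u8>;

// Spawn n_workers threads draining a shared queue. Each returns
// Result<()> so compression or IO errors surface on join.
fn run_workers(
    n_workers: usize,
    rx: mpsc::Receiver<Block>,
) -> Vec<thread::JoinHandle<Result<(), String>>> {
    let rx = Arc::new(Mutex::new(rx));
    (0..n_workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // Hold the lock only while receiving; compression and
                // writing happen outside it.
                let block = match rx.lock().unwrap().recv() {
                    Ok(b) => b,
                    Err(_) => return Ok(()), // queue closed: clean exit
                };
                let _compressed = block; // compress-and-write would go here
            })
        })
        .collect()
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let handles = run_workers(4, rx);
    for _ in 0..10 {
        tx.send(vec![0u8; 64]).unwrap();
    }
    drop(tx); // close the queue so the workers exit
    for h in handles {
        h.join().unwrap().unwrap(); // propagate any worker error
    }
}
```

Joining all the handles at the end gives the main thread one place to collect failures from any worker.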
Pseudocode:

    pending_files = Deque()
    for file in source:
        while queue.length > N:
            # Back-pressure: wait on the oldest outstanding block.
            pending_files[0].channels[0].wait()
        while pending_files and pending_files[0].channels.all_complete():
            index.add(pending_files.pop_front())
        file_completion_channels = []
        for block in pack_blocks_for_file(file):
            block_channel = channel()
            file_completion_channels.append(block_channel)
            queue.push(block.bytes, block_channel)
        pending_files.push_back((file, file_completion_channels))
    # After the loop, wait for and index all remaining pending files.
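The in-order completion step could look like this in Rust; the `PendingFile` struct and the plain `Vec<String>` index are illustrative, and a per-block `mpsc::Receiver<()>` stands in for the completion channel:

```rust
use std::collections::VecDeque;
use std::sync::mpsc;
use std::thread;

// One completion receiver per block written for this file.
struct PendingFile {
    name: String,
    blocks_done: Vec<mpsc::Receiver<()>>,
}

// Files enter the index strictly in the order they were read, even if a
// later file's blocks finish compressing first.
fn drain_in_order(mut pending: VecDeque<PendingFile>, index: &mut Vec<String>) {
    while let Some(file) = pending.pop_front() {
        for rx in &file.blocks_done {
            rx.recv().expect("worker exited without completing block");
        }
        index.push(file.name);
    }
}

fn main() {
    let mut pending = VecDeque::new();
    for name in ["a", "b", "c"] {
        let (tx, rx) = mpsc::channel();
        // Simulate a compression worker signalling block completion.
        thread::spawn(move || tx.send(()).unwrap());
        pending.push_back(PendingFile {
            name: name.to_string(),
            blocks_done: vec![rx],
        });
    }
    let mut index = Vec::new();
    drain_in_order(pending, &mut index);
    assert_eq!(index, ["a", "b", "c"]);
}
```

Blocking on the oldest file's receivers is enough here: since the index must be written in reading order anyway, there is nothing useful the main thread could do while the head of the queue is incomplete.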
Switching to Snappy compression has made compression fast enough that this is not a priority.
I made some progress towards this in 1943157b4c8eb565bbe241e6f682aac60619755e, and it showed a substantial speedup for some cases. There's probably a lot of scope for further parallelization in Rust, but it might require a fuller rewiring to use futures or something, and they're still a bit in flux in Rust. So this can wait until more features are complete.