ffcv
Shard a large .beton file into a few shards
Hi,
I have two related questions:
- Is it possible to concatenate multiple .beton files? That way, the work of creating .beton files could be distributed across multiple processes.
- Along the same lines, can we append new samples to an existing .beton file, or do I have to create a new file from scratch (which is quite inefficient)?
have the same question! would appreciate any help.
Any update on this issue? Is there a way to load multiple .beton files into a single dataloader?
What's the benefit of using multiple beton files?
- the work of creating files can be distributed across multiple processes, which is faster and safer (a fault in one process only affects the file that process was writing)
- provide flexibility to mix multiple types of data sources
- simple way to append new samples, see #266 #325
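
To make the first two points concrete, here is a minimal, library-agnostic sketch. It is not an FFCV feature: `split_indices` and `chain_loaders` are hypothetical helper names, and the code assumes only that each shard's loader is an ordinary Python iterable. The idea is to split the sample indices into contiguous shards (one per writer process, each writing its own .beton file) and later iterate over the per-shard loaders back to back.

```python
from itertools import chain
from typing import Iterable, Iterator, List, Sequence


def split_indices(num_samples: int, num_shards: int) -> List[range]:
    """Split indices 0..num_samples-1 into contiguous, near-equal shards.

    Hypothetical helper: each range could be handed to a separate
    process that writes its own .beton file.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    base, extra = divmod(num_samples, num_shards)
    shards: List[range] = []
    start = 0
    for shard_id in range(num_shards):
        # The first `extra` shards each take one additional sample.
        size = base + (1 if shard_id < extra else 0)
        shards.append(range(start, start + size))
        start += size
    return shards


def chain_loaders(loaders: Sequence[Iterable]) -> Iterator:
    """Iterate over several loaders one after another, as if they were one.

    Hypothetical helper: works for any iterables, e.g. one loader per shard.
    """
    return chain.from_iterable(loaders)


if __name__ == "__main__":
    print(split_indices(10, 3))
    print(list(chain_loaders([[1, 2], [3], [4, 5]])))
```

Note that chaining loaders this way exhausts the shards in order, so samples are never mixed across shards within an epoch; if cross-shard shuffling matters, this sketch is not sufficient on its own.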