streaming
Distributed Key Value Tensor Store
Is it possible to use a streaming dataset as a distributed key-value store?
I have a set of keys (strings like "xyz_123"), each of which corresponds to a NumPy array.
Ideally I could do something like:
np_array = dataset["xyz_123"]
But I see with MDSWriter.write that the keys of the dataset are just sequential, and I can't change them.
Is there a way to have a custom key for MDSWriter?
Hi @OrenLeung, what is the size of the dataset, and how many unique keys does it have?
@karan6181 The size is about 1 TB, with about 100k unique keys.
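One possible workaround, offered only as a sketch and not as a built-in feature of the library: with roughly 100k keys, a plain in-memory dict from string key to sequential sample index is small, so string lookups can be layered on top of whatever integer-indexed dataset holds the arrays. `KeyedView` below is a hypothetical helper, not part of streaming. It assumes the wrapped object supports `dataset[i]` with an integer index, and that the list of keys is kept in the same order the samples were written. A plain list stands in for the real dataset here.

```python
from typing import Any, Dict, Iterable


class KeyedView:
    """Hypothetical read-only view that addresses an integer-indexed
    dataset by string key. Not part of the streaming library."""

    def __init__(self, dataset: Any, keys: Iterable[str]) -> None:
        # `dataset` only needs to support dataset[i] for an integer i.
        self._dataset = dataset
        self._index: Dict[str, int] = {}
        # The i-th key is assumed to belong to the i-th written sample.
        for position, key in enumerate(keys):
            if key in self._index:
                raise ValueError(f"duplicate key: {key!r}")
            self._index[key] = position

    def __getitem__(self, key: str) -> Any:
        # Translate the string key to its sequential index, then delegate.
        return self._dataset[self._index[key]]

    def __contains__(self, key: str) -> bool:
        return key in self._index

    def __len__(self) -> int:
        return len(self._index)


# Stand-in data: a list plays the role of the integer-indexed dataset.
samples = [[1.0, 2.0], [3.0, 4.0, 5.0]]
keys = ["xyz_123", "abc_456"]

view = KeyedView(samples, keys)
first = view["xyz_123"]  # looks up index 0 in the underlying dataset
```

The key list itself would need to be saved somewhere alongside the dataset (for example as a small JSON file) and loaded on every worker. This sketch does not cover that part, and it does not change how samples are written.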