ValueError: cannot mmap an empty file
When I try to check the shape of train.features.mmap, numpy raises this error. How can I solve this problem?
By the way, can I use the mmap files (such as train/valid/test.features.mmap) directly as the video features, for example by saving them as .npy files for multimodal training?
Thank you.
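A note on the error itself: "cannot mmap an empty file" typically means the file on disk is 0 bytes, which often points to an incomplete download or extraction. A minimal sketch for checking this before calling np.memmap is below. The helper name count_feature_rows and the dim argument are hypothetical, and float32 is assumed because that is the dtype used in the np.memmap call quoted later in this thread.

```python
import os

import numpy as np


def count_feature_rows(path, dim, dtype="float32"):
    """Return how many feature rows a raw memmap file holds (hypothetical helper)."""
    size = os.path.getsize(path)
    if size == 0:
        # A 0-byte file cannot be memory-mapped at all.
        raise ValueError(f"{path} is empty (0 bytes); re-download or re-extract it")
    row_bytes = dim * np.dtype(dtype).itemsize
    if size % row_bytes != 0:
        # The file size does not match the assumed dim/dtype, or the file is truncated.
        raise ValueError(f"{path} has {size} bytes, not a multiple of {row_bytes}")
    return size // row_bytes
```

If the check passes, the returned row count can be compared with the number of images you expect for that split.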
Because I didn't download the complete image archive, I would like to know how many images there are in the training, validation, and test sets respectively.
@xiang-xiang-zhu Hi, I guess you are not familiar with the mmap format.
We chose mmap instead of .npy because loading an .npy file reads the whole np.array into memory, and our feature files are too big for that.
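To illustrate the difference, here is a small sketch that writes a toy feature file and then reads only a few rows from it through np.memmap. The file name demo.features.mmap and the sizes are made up for the demo; they are not the real dataset values.

```python
import os
import tempfile

import numpy as np

# Toy stand-in for a real feature file (hypothetical name and sizes).
tmp_dir = tempfile.mkdtemp()
raw_path = os.path.join(tmp_dir, "demo.features.mmap")
num_rows, dim = 1000, 16

# Write the toy features to disk as raw float32 values.
writer = np.memmap(raw_path, dtype="float32", mode="w+", shape=(num_rows, dim))
writer[:] = np.arange(num_rows * dim, dtype="float32").reshape(num_rows, dim)
writer.flush()
del writer

# Opening in read mode maps the file; data is paged in only when it is accessed.
features = np.memmap(raw_path, dtype="float32", mode="r", shape=(num_rows, dim))

# Copy just four rows into an ordinary in-memory array.
batch = np.array(features[10:14])
print(batch.shape)
```

A raw memmap file stores no header, so the dtype and shape have to be supplied by the caller. If .npy files are preferred, np.load also accepts mmap_mode="r", which avoids loading the whole array at once.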
If you want to know the number of images without downloading them, you can download the text first; each sentence in the text should be paired with an image.
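A rough way to do that count, assuming the text file stores one sentence per line (this layout is an assumption, so please check it against the actual files), is to count the non-empty lines. The function name is hypothetical.

```python
def count_nonempty_lines(path):
    """Count non-empty lines in a text file (assumes one sentence per line)."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())
```

If each sentence is paired with one image as described above, this count would also be the number of images in that split.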
Now I want to train with my own model. Can I read the image features directly with mmap? Is the shape (image_num, feature_dim)?
@xiang-xiang-zhu Yes, please refer to our corresponding code.
Thank you very much. I would like to ask whether [0, 1, 2] in the jsonl file represents one conversation made up of sentences 0 to 2, where 1 is the response to 0 and 2 is the response to 1, and whether the conversation [3, 4, 5] is unrelated to [0, 1, 2] during training.
np.memmap(feature_file(data_dir, split), dtype='float32', mode='r', shape=(self.total_num, self.dim))
By the way, when I read the mmap file with this code, does the array index correspond to the image index? For example, after reading train.features.mmap, is the nth element of the array the feature of the nth training image?
@xiang-xiang-zhu Yes, both of your comments are right.
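For anyone who wants to use the features in their own model, the row-per-image layout confirmed above could be wrapped roughly like this. It is only a sketch: the class name FeatureStore is hypothetical and is not part of the repository code, and total_num and dim must match the values used when the file was written.

```python
import numpy as np


class FeatureStore:
    """Read-only view over a raw float32 feature file (hypothetical wrapper)."""

    def __init__(self, path, total_num, dim):
        # Same call pattern as the np.memmap line quoted in this thread.
        self.features = np.memmap(
            path, dtype="float32", mode="r", shape=(total_num, dim)
        )

    def __len__(self):
        return self.features.shape[0]

    def __getitem__(self, n):
        # Row n holds the feature vector of the nth image; copy it into memory.
        return np.array(self.features[n])
```

Indexing with store[n] then returns a 1-D array of length dim for the nth image, and only that row is copied out of the mapped file.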