MemoryError: Not enough memory to fit the whole sample
Hello!
I'm having trouble writing my dataset.
The batches are already prepared as numpy arrays, so I just need to feed them through a dataset so they can be written to a .beton file. Each batch consists of 4 numpy arrays (see the field dictionary below), and all batches live in a dictionary called batches with indices as keys, so batches[n] returns a tuple of the 4 numpy arrays. I even stack them by field to resemble the LinearRegressionDataset example in the docs, i.e. I do batches = [np.stack([batches[i][j] for i in range(len(batches))]) for j in range(4)], just in case.
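For concreteness, here is a minimal sketch of that restacking step on dummy data (the shapes and batch count are made up purely for illustration, and I assign the result to a new name to keep it separate from the original dict):

import numpy as np

# Dummy stand-in for the real batches dict: index -> (raw, labels_mask, gt_affs, affs_weights)
batches = {
    n: (
        np.zeros((1, 4, 8, 8), dtype=np.float32),     # raw
        np.zeros((1, 2, 4, 4), dtype=np.uint8),       # labels_mask
        np.zeros((3, 2, 4, 4), dtype=np.uint8),       # gt_affs
        np.zeros((1, 3, 2, 4, 4), dtype=np.float32),  # affs_weights
    )
    for n in range(5)
}

# Restack by field: fields[j] has shape (num_batches, *field_shape)
fields = [np.stack([batches[i][j] for i in range(len(batches))]) for j in range(4)]
print([f.shape for f in fields])
# [(5, 1, 4, 8, 8), (5, 1, 2, 4, 4), (5, 3, 2, 4, 4), (5, 1, 3, 2, 4, 4)]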
My dataset class is:
class Dataset:
    def __init__(self, batches):
        # batches is the per-field list produced by the np.stack step above
        self.raw = batches[0]
        self.labels_mask = batches[1]
        self.gt_affs = batches[2]
        self.affs_weights = batches[3]

    def __len__(self):
        return len(self.raw)

    def __getitem__(self, index):
        # one sample: a tuple of the 4 arrays, in the same order as the writer's fields
        return (self.raw[index], self.labels_mask[index], self.gt_affs[index], self.affs_weights[index])
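As a quick sanity check (using the class and the restacked arrays from above), each sample should already match the shapes and dtypes I declare for the writer below:

import numpy as np

dataset = Dataset(batches)  # batches = the restacked per-field list from above
print(len(dataset))
raw, labels_mask, gt_affs, affs_weights = dataset[0]
assert raw.shape == (1, 48, 196, 196) and raw.dtype == np.float32
assert labels_mask.shape == (1, 28, 104, 104) and labels_mask.dtype == np.uint8
assert gt_affs.shape == (3, 28, 104, 104) and gt_affs.dtype == np.uint8
assert affs_weights.shape == (1, 3, 28, 104, 104) and affs_weights.dtype == np.float32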
Here is my writer and write command:
writer = DatasetWriter(os.path.join(data_dir, 'test.beton'), {
    'raw': NDArrayField(shape=(1, 48, 196, 196), dtype=np.dtype('float32')),
    'labels_mask': NDArrayField(shape=(1, 28, 104, 104), dtype=np.dtype('uint8')),
    'gt_affs': NDArrayField(shape=(3, 28, 104, 104), dtype=np.dtype('uint8')),
    'affs_weights': NDArrayField(shape=(1, 3, 28, 104, 104), dtype=np.dtype('float32')),
}, num_workers=1)

writer.from_indexed_dataset(dataset, chunksize=1)
This always results in the following traceback and then hangs until I Ctrl+C:
Traceback (most recent call last):
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/site-packages/ffcv/writer.py", line 113, in worker_job_indexed_dataset
    handle_sample(sample, dest_ix, field_names, metadata, allocator, fields)
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/site-packages/ffcv/writer.py", line 51, in handle_sample
    field.encode(destination, field_value, allocator.malloc)
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/site-packages/ffcv/fields/ndarray.py", line 98, in encode
    destination[0], data_region = malloc(self.element_size)
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/site-packages/ffcv/memory_allocator.py", line 65, in malloc
    raise MemoryError("Not enough memory to fit the whole sample")
MemoryError: Not enough memory to fit the whole sample

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/site-packages/ffcv/writer.py", line 117, in worker_job_indexed_dataset
    done_number.value += len(chunk)
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/site-packages/ffcv/memory_allocator.py", line 119, in __exit__
    self.flush_page()
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/site-packages/ffcv/memory_allocator.py", line 84, in flush_page
    assert self.page_offset != 0
AssertionError
^CTraceback (most recent call last):
  File "/scratch1/04101/vvenu/autoseg/cremi/02_train/multi_gpu_test/mkdata_copy.py", line 295, in <module>
    writer.from_indexed_dataset(dataset,chunksize=1)
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/site-packages/ffcv/writer.py", line 297, in from_indexed_dataset
    self._write_common(len(indices), chunks(indices, chunksize),
  File "/scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/site-packages/ffcv/writer.py", line 255, in _write_common
    sleep(0.1)
KeyboardInterrupt
0%| | 0/10 [04:29<?, ?it/s]^C
(ffcv) c196-011[rtx](1064)$ /scratch1/04101/vvenu/miniconda3/envs/ffcv/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
Each batch is about 10 MB.
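For what it's worth, a rough estimate of the per-sample payload straight from the field shapes (a lower bound only, since FFCV adds its own metadata and alignment on top):

import numpy as np

# Per-field byte counts computed from the declared shapes/dtypes
shapes_dtypes = {
    'raw': ((1, 48, 196, 196), np.float32),
    'labels_mask': ((1, 28, 104, 104), np.uint8),
    'gt_affs': ((3, 28, 104, 104), np.uint8),
    'affs_weights': ((1, 3, 28, 104, 104), np.float32),
}
sizes = {k: int(np.prod(shape)) * np.dtype(dt).itemsize for k, (shape, dt) in shapes_dtypes.items()}
for k, v in sizes.items():
    print(f"{k}: {v / 2**20:.2f} MiB")
print(f"total: {sum(sizes.values()) / 2**20:.2f} MiB")
# raw alone is ~7 MiB and the total is ~12 MiB per sample, which is in the same
# ballpark as (or larger than) the writer's default page size -- presumably why
# the allocator can't fit the whole sample.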
Appreciate any help! Thank you so much!! Cheers on the project, would love to get it working for my case :)
@yajivunev Were you able to solve this? I'm currently struggling with the same issue. Originally I got a page size error, but after ramping up the page size I get the same error as you, even though a single sample should easily fit in memory.
Update: I was able to resolve this by increasing the page size even further.
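For anyone landing here later, this is roughly what that looks like. It is only a sketch: in the FFCV versions I've used, DatasetWriter accepts a page_size argument that, as far as I can tell, must be a power of two, and it just needs to be comfortably larger than your biggest field; the exact value you need may differ.

import os
import numpy as np
from ffcv.writer import DatasetWriter
from ffcv.fields import NDArrayField

# data_dir and dataset are assumed to be defined as in the original post.
# 1 << 24 = 16 MiB, enough for the largest field here (raw, ~7 MiB per sample);
# increase further if the MemoryError persists.
writer = DatasetWriter(os.path.join(data_dir, 'test.beton'), {
    'raw': NDArrayField(shape=(1, 48, 196, 196), dtype=np.dtype('float32')),
    'labels_mask': NDArrayField(shape=(1, 28, 104, 104), dtype=np.dtype('uint8')),
    'gt_affs': NDArrayField(shape=(3, 28, 104, 104), dtype=np.dtype('uint8')),
    'affs_weights': NDArrayField(shape=(1, 3, 28, 104, 104), dtype=np.dtype('float32')),
}, num_workers=1, page_size=1 << 24)

writer.from_indexed_dataset(dataset, chunksize=1)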