Decompression issue when creating new from buffer
I have an *.svs image made with an Aperio slide scanner, which apparently uses 33003 compression. I have no problem loading the image when reading directly from a file:
import pyvips as vips

image = vips.Image.new_from_file(path)
region = image.crop(x, y, w, h)
avg = region.avg()
But if I try to load the image via a buffer:
with open(path, 'rb') as f:
    buffer = f.read()

image = vips.Image.new_from_buffer(buffer, "")
region = image.crop(x, y, w, h)
avg = region.avg()
I get the error message:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/shawarma/venv/lib/python3.8/site-packages/pyvips/vimage.py", line 919, in call_function
return pyvips.Operation.call(name, self, *args, **kwargs)
File "/home/shawarma/venv/lib/python3.8/site-packages/pyvips/voperation.py", line 282, in call
raise Error('unable to call {0}'.format(operation_name))
pyvips.error.Error: unable to call avg
source input: Compression scheme 33003 tile decoding is not implemented
Any ideas?
Could it be picking different loaders? I would try using openslideload instead of new_from_file, and openslideload_buffer instead of new_from_buffer.
There is no openslideload_buffer method, unless I'm missing something...
Ah! It's coming back to me.
Yes, openslide can only load images from the filesystem. For example, MRXS images are kept in quite a large directory tree, and there isn't really a single object that could hold them. When you used new_from_buffer, libvips will have opened the image with the plain TIFF loader (since SVS images are a type of TIFF), and the standard TIFF loader does not know about the jp2k compression that SVS uses internally.
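You can confirm which loader was picked, if your libvips is recent enough to record the vips-loader metadata field (a quick sketch; path is assumed to point at your SVS file):

import pyvips as vips

# loading from a file: libvips should pick the openslide loader for SVS
img_file = vips.Image.new_from_file(path)
print(img_file.get('vips-loader'))

# loading from a buffer: only buffer-capable loaders are considered,
# so libvips falls back to the plain TIFF loader
with open(path, 'rb') as f:
    img_buf = vips.Image.new_from_buffer(f.read(), "")
print(img_buf.get('vips-loader'))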
The (slightly ugly) solution would be to write the buffer to a temporary file, and then open that.
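Something along these lines, for instance (a rough sketch: the tempfile approach and the .svs suffix are my assumptions, and note that pyvips is lazy, so the result must be computed before the temporary file is deleted):

import tempfile

import pyvips as vips

with open(path, 'rb') as f:
    buffer = f.read()

# write the in-memory buffer to a real file so openslide can see it
with tempfile.NamedTemporaryFile(suffix='.svs') as tmp:
    tmp.write(buffer)
    tmp.flush()
    image = vips.Image.openslideload(tmp.name)
    # evaluate while the temporary file still exists
    avg = image.crop(x, y, w, h).avg()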
Ah, that makes sense. I'm looking for a flexible way to load the compressed image into RAM before reading/decompressing it. We have 10-20 TB of slides on slow hard drives and need to make many random reads, one slide at a time. The random read speed of HDDs is a significant bottleneck, and I'm hoping that holding the compressed image in RAM will help: each slide can then be read from disc in one pass, and sequential reads are much faster than random ones.
Unfortunately I'm not really seeing a way to do this, apart from setting up a RAM disk for temporary file storage, but that's not a flexible or easily distributable solution.
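For what it's worth, on most Linux systems /dev/shm is already mounted as tmpfs, so the RAM-disk variant of the temp-file trick needs no extra setup there. A sketch under that assumption (Linux-specific, not portable):

import tempfile

import pyvips as vips

with open(path, 'rb') as f:
    buffer = f.read()  # one sequential read from the slow disk

# the temporary file is created on the tmpfs mount, so the random
# reads made during decoding are served from RAM, not the HDD
with tempfile.NamedTemporaryFile(suffix='.svs', dir='/dev/shm') as tmp:
    tmp.write(buffer)
    tmp.flush()
    avg = vips.Image.openslideload(tmp.name).crop(x, y, w, h).avg()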