
Are there plans to support WebP?

Open CS123n opened this issue 2 years ago • 1 comments

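import io

import torch
import webdataset as wds
from PIL import Image

sink = wds.TarWriter('test.tar')
# WebP has to be encoded by hand before it is handed to the writer.
image_byte = io.BytesIO()
image = Image.fromarray(torch.randint(0, 256, (256, 256, 3), dtype=torch.uint8).numpy())
image.save(image_byte, format='webp')
sink.write({
    '__key__': 'sample001',
    'txt': 'Are you ok?',
    'jpg': torch.rand(256, 256, 3).numpy(),  # array, encoded by the writer
    'webp': image_byte.getvalue(),  # pre-encoded bytes
})
sink.close()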

For now, I have to encode WebP images to bytes myself, whereas jpg and png images can be passed to the writer directly.

CS123n, Apr 13 '23 07:04

You can just add a handler to webdataset.writer.default_handlers, or you can completely override the encoder in TarWriter.

ImageIO seems to support it natively, so I may add it by default.

(This needs to be documented better and perhaps needs a nicer API.)

tmbdev, Apr 28 '23 20:04
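
For anyone landing here, a minimal sketch of the handler approach suggested above. It rests on assumptions that are not confirmed in this thread: that webdataset.writer.default_handlers is a flat dict mapping a file extension to a one-argument encoder function (the layout may differ between webdataset versions), and that Pillow is installed with WebP support. The names webp_handler and register_webp are made up for illustration and are not part of webdataset.

```python
import io


def webp_handler(image, quality=90):
    """Encode a PIL image or an HxWx3 uint8 array as WebP bytes.

    Hypothetical helper, not part of webdataset.
    """
    # Already-encoded data is returned unchanged.
    if isinstance(image, (bytes, bytearray)):
        return bytes(image)

    # Pillow is imported lazily so this module loads without it.
    from PIL import Image

    if not isinstance(image, Image.Image):
        image = Image.fromarray(image)
    buf = io.BytesIO()
    image.save(buf, format="WEBP", quality=quality)
    return buf.getvalue()


def register_webp(handlers):
    """Add the WebP handler to an extension -> encoder dict.

    An existing 'webp' entry is left untouched.
    """
    handlers.setdefault("webp", webp_handler)
    return handlers


# Assumed usage (requires webdataset; the dict layout is an assumption):
#   import webdataset as wds
#   register_webp(wds.writer.default_handlers)
#   sink = wds.TarWriter("test.tar")
#   sink.write({"__key__": "sample001", "webp": uint8_array})
```

Registering the handler before constructing the TarWriter is probably the safer order, in case the writer captures the handler table when it is created. If the dict layout turns out to be different in your version, passing a custom function through the encoder argument of TarWriter is the other route mentioned above.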