Question: ensure minimum image dimensions

Open dbpprt opened this issue 3 years ago • 2 comments

I'm currently trying to resize an image only if it is smaller than a given HxW.

I expected the following code to resize the image only if it is smaller than size, but it also resizes inputs that are larger than the given size.

x = fn.resize(
    x,
    size=512,
    max_size=2048,
    mode="not_smaller",
    dtype=dali.types.UINT8,
    interp_type=types.DALIInterpType.INTERP_CUBIC,
    device=device,
)

Is there a way to achieve that?

Thanks!

dbpprt avatar Feb 27 '22 20:02 dbpprt

This seems to work, although it is not very elegant:

# square pad
shapes = fn.shapes(y, dtype=types.INT32)
h = fn.slice(shapes, 0, 1, axes=[0])
w = fn.slice(shapes, 1, 1, axes=[0])
side = dali.math.max(h, w)
x, y = fn.pad([x, y], axis_names="HW", shape=fn.cat(side, side, axis=0))

# ensure the minimum size of the image e.g. 512
min_size = fn.cast(dali.math.max(side, 512), dtype=types.FLOAT)

x, y = fn.resize(
    [x, y],
    size=fn.cat(min_size, min_size, axis=0),
    mode="not_smaller",
    dtype=dali.types.UINT8,
    interp_type=types.DALIInterpType.INTERP_CUBIC,
    device=device,
)

dbpprt avatar Feb 27 '22 21:02 dbpprt

Hello @dennisbappert

You don't have to pad to square - you can still employ resize to maintain aspect ratio (if that's what you want). Also, in case your data is already on GPU, you won't be able to use fn.shapes, but if you happen to have the images in an encoded form, you can look up the shape like this:

    jpegs, labels = fn.readers.file(files=[jpeg_file], file_root=test_data_root)
    shapes = fn.peek_image_shape(jpegs)[0:2]  # get the shape from the raw file and select HW
    shapes = dali.math.max(shapes, 512.0)

    images = fn.decoders.image(jpegs, device="mixed")
    
    resized = fn.resize(images, interp_type=types.DALIInterpType.INTERP_CUBIC, size=shapes, mode="not_smaller")

If you have your data on the CPU, you can use fn.shapes:

    shapes = fn.shapes(images)[0:2]  # get the shape of the decoded image and select HW
    shapes = dali.math.max(shapes, 512.0)

    resized = fn.resize(images, interp_type=types.DALIInterpType.INTERP_CUBIC, size=shapes, mode="not_smaller")
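
To see why clamping the requested size to the image's own shape leaves large images untouched, here is a rough plain-Python sketch of the size arithmetic. It does not call DALI. The helper names are made up for illustration, and both the scale rule (take the larger of the two per-axis ratios) and the rounding are assumptions about how mode="not_smaller" behaves, not a copy of the library's code.

```python
def not_smaller_output(h, w, size_h, size_w):
    """Hypothetical model of mode="not_smaller": keep the aspect ratio and
    pick the scale so that neither side ends up below the requested size."""
    scale = max(size_h / h, size_w / w)
    # Rounding to the nearest integer is an assumption for this sketch.
    return round(h * scale), round(w * scale)


def clamped_size(h, w, min_side=512):
    """Mirror of dali.math.max(shapes, 512.0): element-wise max of the
    image shape and the minimum side length."""
    return max(h, min_side), max(w, min_side)


# Fixed size=512 also shrinks an image that is already large enough.
fixed = not_smaller_output(1024, 2048, 512, 512)

# With the clamped size the requested size equals the shape, so scale is 1.
unchanged = not_smaller_output(1024, 2048, *clamped_size(1024, 2048))

# Only the too-small side drives the upscale; the aspect ratio is kept.
upscaled = not_smaller_output(300, 800, *clamped_size(300, 800))
```

Under these assumptions, fixed is (512, 1024), which is the unwanted downscale from the original question, unchanged stays (1024, 2048), and upscaled becomes (512, 1365).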

mzient avatar Feb 28 '22 11:02 mzient