[df-if II] add additional input checks to ensure the input is divisible by 8
What does this PR do?
Fixes #7842
Adds logic to `check_inputs` in the IF SuperResolution pipeline so that the user receives a clear error when attempting to run the pipeline with an invalid input image size.
This is possible to hit when using the super-resolution model to upscale evaluation images during training: if, e.g., the target 256-pixel resolution is aligned to 8px intervals and then divided by 4 to obtain the input image size, the stage II output resolution will be fine, but the input resolution can end up not divisible by 8.
I suppose there are other ways to hit the problem, but it has always been a bit murky which input is causing it.
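The check can be sketched roughly like this (a minimal standalone sketch; the helper name and signature are hypothetical, the real logic lives in the pipeline's `check_inputs`):

```python
def check_image_size(height: int, width: int, multiple: int = 8) -> None:
    """Raise a clear error when the input image's spatial size is not
    divisible by `multiple` (hypothetical standalone helper)."""
    if height % multiple != 0 or width % multiple != 0:
        raise ValueError(
            f"`image` height and width must be divisible by {multiple}, "
            f"but are {height} and {width}."
        )

# A 64px input is fine, but an 8px-aligned 248px target divided by 4
# gives a 62px input, which would previously fail deep inside the model.
check_image_size(64, 64)
```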
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline?
- [x] Did you read our philosophy doc (important for complex PRs)?
- [x] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [x] Did you write any new necessary tests?
Who can review?
- Pipelines: @sayakpaul @yiyixuxu @DN6
ohh thanks for looking into this!
What we usually do with image inputs is resize them, in the preprocessing step, to a default height and width that are divisible by 8: https://github.com/huggingface/diffusers/blob/58237364b1780223f48a80256f56408efe7b59a0/src/diffusers/image_processor.py#L407
So instead of adding the checks, I think we should just resize the image. We can either add the resize step to the `preprocess_image` method of the IF pipeline, or refactor the method to use the `VaeImageProcessor` like we do in the rest of the pipelines: https://github.com/huggingface/diffusers/blob/58237364b1780223f48a80256f56408efe7b59a0/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py#L292
what do you think?
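For context, the rounding that such a resize would apply is roughly the following (a sketch assuming round-down-to-the-nearest-multiple behavior; the function name is made up for illustration):

```python
def round_down_to_multiple(size: int, multiple: int = 8) -> int:
    # Drop the remainder so the result is divisible by `multiple`.
    return size - size % multiple

# At small stage I resolutions the adjustment is proportionally much
# larger than at typical SD/SDXL resolutions: 62px -> 56px changes the
# size by ~10%, while 1022px -> 1016px changes it by well under 1%.
```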
I considered it, but because of the nature of this pipeline I didn't feel comfortable silently squishing images on the user's behalf. At the small input resolutions involved, the result can be noticeably distorted, whereas with SD and SDXL at 512/768/1024px adjusting the size is far less destructive. @yiyixuxu how do you feel about that mindset applied to a 64px model, where we might end up adjusting the image size by somewhere around ~5-7%?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@yiyixuxu I noticed the quality checks failed because of an unnecessary list comprehension. But when I look at it, it seems like the most reasonable way to do it? Is there a better way? I would love to learn 😁
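For reference, the usual shape of that lint (a hypothetical example, since the flagged line isn't shown here) is a list comprehension passed straight to `any()`/`all()`, where a plain generator expression is preferred:

```python
sizes = [56, 60, 64]

# Flagged: builds a full list just to feed it to any().
has_bad = any([s % 8 != 0 for s in sizes])

# Preferred: the generator expression short-circuits and allocates no list.
has_bad = any(s % 8 != 0 for s in sizes)
```

Dropping the square brackets is usually all `make style`'s linter is asking for.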
can you run make style again?
@yiyixuxu done
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.