[df-if II] add additional input checks to ensure the input is divisible by 8
What does this PR do?
Fixes #7842
Adds logic to `check_inputs` in the IF SuperResolution pipeline so that the user receives a clear error when attempting to run the pipeline with an invalid input image size.
This is possible to hit when using the super-resolution model to upscale evaluation images during training: if, e.g., the target 256-pixel resolution is aligned to 8px intervals and then divided by 4 to obtain the input image size, the stage II output resolution will be fine, but the input resolution can end up not divisible by 8.
I suppose there are other ways to hit the problem, but it has always been a bit murky which input is causing it.
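The check can be sketched roughly like this (a minimal standalone sketch; the helper name and signature are hypothetical, the real logic lives in the pipeline's `check_inputs`):

```python
def check_image_size(height: int, width: int, multiple: int = 8) -> None:
    """Raise a clear error when the input image's spatial size is not
    divisible by `multiple` (hypothetical standalone helper)."""
    if height % multiple != 0 or width % multiple != 0:
        raise ValueError(
            f"`image` height and width must be divisible by {multiple}, "
            f"but are {height} and {width}."
        )

# A 64px input is fine, but an 8px-aligned 248px target divided by 4
# gives a 62px input, which would previously fail deep inside the model.
check_image_size(64, 64)
```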
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline?
- [x] Did you read our philosophy doc (important for complex PRs)?
- [x] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [x] Did you write any new necessary tests?
Who can review?
- Pipelines: @sayakpaul @yiyixuxu @DN6
ohh thanks for looking into this!
What we usually do with image inputs is resize them, in the preprocessing step, to a default height and width that are divisible by 8: https://github.com/huggingface/diffusers/blob/58237364b1780223f48a80256f56408efe7b59a0/src/diffusers/image_processor.py#L407
So instead of adding the checks, I think we should just resize the image. We can either add the resize step to the `preprocess_image` method of the IF pipeline, or refactor the method to use the `VaeImageProcessor` like we do in the rest of the pipelines: https://github.com/huggingface/diffusers/blob/58237364b1780223f48a80256f56408efe7b59a0/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py#L292
what do you think?
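For context, the rounding that such a resize would apply is roughly the following (a sketch assuming round-down-to-the-nearest-multiple behavior; the function name is made up for illustration):

```python
def round_down_to_multiple(size: int, multiple: int = 8) -> int:
    # Drop the remainder so the result is divisible by `multiple`.
    return size - size % multiple

# At small stage I resolutions the adjustment is proportionally much
# larger than at typical SD/SDXL resolutions: 62px -> 56px changes the
# size by ~10%, while 1022px -> 1016px changes it by well under 1%.
```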
I considered it, but because of the nature of this pipeline I didn't feel comfortable silently squishing images on the user's behalf. At the small input resolutions involved, the result can be noticeably distorted, whereas with SD and SDXL at 512/768/1024px adjusting the size is far less destructive. @yiyixuxu how do you feel about that mindset applied to a 64px model, where we might end up adjusting the image size by somewhere around ~5-7%?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@yiyixuxu I noticed the quality checks failed because of an unnecessary list comprehension. But when I look at it, it seems like the most reasonable way to do it? Is there a better way? I would love to learn 😁
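For reference, the usual shape of that lint (a hypothetical example, since the flagged line isn't shown here) is a list comprehension passed straight to `any()`/`all()`, where a plain generator expression is preferred:

```python
sizes = [56, 60, 64]

# Flagged: builds a full list just to feed it to any().
has_bad = any([s % 8 != 0 for s in sizes])

# Preferred: the generator expression short-circuits and allocates no list.
has_bad = any(s % 8 != 0 for s in sizes)
```

Dropping the square brackets is usually all `make style`'s linter is asking for.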
can you run make style again?
@yiyixuxu done
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.