Improve the precision of our integration tests
We currently have rather low precision when testing our pipeline, for two reasons:
- Our reference is an image and not a numpy array. This means that when we created our reference image we lost float precision, which is unnecessary.
- We only test for `.max() < 1e-2`. IMO we should test for `.max() < 1e-4` with the numpy arrays. In my experiments across multiple devices I have not seen differences bigger than `1e-4` when using full precision.
IMO this could have also prevented: https://github.com/huggingface/diffusers/issues/902
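To illustrate the tolerance point, here is a minimal sketch (not the actual test code in the repo). The `max_abs_diff` helper and the `5e-3` drift are made up for illustration; the idea is just that a small numerical change can slip under `1e-2` while a `1e-4` check would flag it.

```python
import numpy as np


def max_abs_diff(actual, expected):
    """Largest element-wise absolute difference (hypothetical helper)."""
    actual = np.asarray(actual, dtype=np.float64)
    expected = np.asarray(expected, dtype=np.float64)
    return float(np.abs(actual - expected).max())


# Stand-in for a full-precision reference array.
expected = np.linspace(0.0, 1.0, 12, dtype=np.float32).reshape(2, 2, 3)

# Hypothetical small regression: every value drifts by 5e-3.
drifted = expected + np.float32(5e-3)

diff = max_abs_diff(drifted, expected)
passes_loose = diff < 1e-2   # the current tolerance lets the drift through
passes_strict = diff < 1e-4  # the proposed tolerance catches it
print(diff, passes_loose, passes_strict)
```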
I'm going to start work on this right now
@patrickvonplaten after some research I think I understand the issue.
The images are currently stored in a low-precision format (e.g. PNG), which prevents us from testing with any greater precision than 1e-2. Even if we convert the image into a numpy array, this will not help, since the image itself is missing precision.
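A rough way to see the precision loss, assuming the reference was stored with 8 bits per channel (an assumption for this sketch; the random array just stands in for real pipeline output):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a float32 pipeline output with values in [0, 1).
image = rng.random((64, 64, 3), dtype=np.float32)

# Roughly what writing to an 8-bit image and reading it back does:
# every value is snapped to one of 256 levels.
as_uint8 = (image * 255).round().astype(np.uint8)
restored = as_uint8.astype(np.float32) / 255.0

# The rounding error can be as large as half a level, 0.5 / 255 (about 2e-3),
# which is already well above 1e-4.
max_err = float(np.abs(image - restored).max())
print(max_err)
```

So with an 8-bit reference, a `1e-4` check would fail even when the pipeline output is unchanged.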
What we need to do is store a numpy representation of the image, say somewhere inside https://huggingface.co/datasets which we can then download and use for comparison.
The way we could do this is by generating the test output, saving that output as a numpy image and uploading it.
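Something along these lines, as a sketch only: the file name `expected_image.npy` is hypothetical, the random array stands in for the real pipeline output, and a temporary directory stands in for the upload/download step.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the pipeline output, kept as float32.
output = rng.random((1, 64, 64, 3), dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; in practice this file would be uploaded
    # and downloaded again inside the test.
    path = os.path.join(tmp, "expected_image.npy")
    np.save(path, output)       # stores the raw float32 values
    expected = np.load(path)    # loads them back with the same dtype

# The .npy round trip is lossless, so a tight tolerance is usable.
max_diff = float(np.abs(expected - output).max())
assert max_diff < 1e-4
```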
This is what I plan to do tomorrow.
This Stack Overflow thread was very helpful in understanding the issue.
Hey @Lewington-pitsos,
That's a very good observation - we came to the same conclusion here: https://github.com/huggingface/diffusers/issues/937 :-) Do you have an account on the Hugging Face Hub? Would you like to upload the numpy images to a dataset on the Hub maybe? :-) This would be super useful!
Hey, yes I am currently working on this in fact!
@patrickvonplaten I made a PR adding the files: https://huggingface.co/datasets/hf-internal-testing/diffusers-images/discussions/2
Thanks a mille @Lewington-pitsos for making this very important improvement-of-life PR! Leaving this issue open as we still need to apply the same changes to:
- https://github.com/huggingface/diffusers/blob/main/tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py and
- https://github.com/huggingface/diffusers/blob/main/tests/pipelines/stable_diffusion/test_stable_diffusion.py