New Pipeline: Tiled-upscaling with depth perception to avoid blurry spots
Hey AI trainers and other curious folks.
This is my first PR to Diffusers (I think; at least it's the first with an actual new feature). It adds an upscaling pipeline that trades VRAM for compute and allows virtually unlimited upscaling: on a single 3080 it can go as big as 8K in a matter of minutes, with few noticeable artifacts.
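To make the VRAM-for-compute trade-off concrete, here is a minimal sketch of the tile-splitting idea: split the image into fixed-size overlapping tiles, upscale each tile independently, then blend the overlaps. Note this is illustrative only; `make_tiles` and its default tile size and overlap are my own names, not the pipeline's actual code.

```python
# Illustrative sketch of overlapping tiling, not the PR's actual code.
# Assumes width and height are both at least tile_size.
def make_tiles(width, height, tile_size=128, overlap=32):
    """Return (x, y, w, h) boxes that cover the image, with `overlap`
    pixels shared between neighbouring tiles so seams can be blended."""
    step = tile_size - overlap
    # Clamp the last row/column so every tile stays inside the image.
    xs = sorted({min(x, width - tile_size) for x in range(0, width, step)})
    ys = sorted({min(y, height - tile_size) for y in range(0, height, step)})
    return [(x, y, tile_size, tile_size) for y in ys for x in xs]
```

Because each tile fits in VRAM on its own, the final image size is bounded only by how many tiles you are willing to process, which is where the compute cost comes in.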
This started as a research project with its own model; I then ported the feature to Stable Diffusion 2's upscaler, with very cool results in return.
It was first introduced as a testing feature in my own Discord bot, but in the meantime I learned how pipelines work and, to the best of my knowledge, made it available for Diffusers.
This is my first pipeline, so please be gentle with reviews and feedback. I'm willing to learn to contribute in a more standard fashion, and I'll apply any feedback from others that makes sense.
The main thing I'm uncertain about is that the contribution guidelines state:
> **Self-contained**: A pipeline shall be as self-contained as possible. More specifically, this means that all functionality should be either directly defined in the pipeline file itself, should be inherited from (and only from) the DiffusionPipeline class or be directly attached to the model and scheduler components of the pipeline.
Since my code borrows heavily from the original upscaling code, I could either copy that code and add my features on top, or simply refer to `pipeline_stable_diffusion_upscale`. I decided to do the latter because it was less clunky and faster to do.
Because of my aforementioned lack of experience in contributing pipelines, and because the pipeline's "self-contained-ness" may still need to change, I consider this PR an ongoing discussion.
The pipeline contains a `__main__` entry point that can be called through the CLI for a demo. The example code is:

```python
import torch
from PIL import Image

# StableDiffusionTiledUpscalePipeline is defined in this PR's pipeline file
model_id = "stabilityai/stable-diffusion-x4-upscaler"
pipe = StableDiffusionTiledUpscalePipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

image = Image.open("../../docs/source/imgs/diffusers_library.jpg")

def callback(obj):
    print(f"progress: {obj['progress']:.4f}")
    obj['image'].save("diffusers_library_progress.jpg")

final_image = pipe(image=image, prompt="Black font, white background, vector", noise_level=40, callback=callback)
final_image.save("diffusers_library.jpg")
```
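For completeness: after upscaling, the tiles have to be merged back together without visible seams. A hedged sketch of one common approach is feathered (linearly ramped) blending in the overlap regions; `blend_tiles` and its parameters are illustrative, not the PR's actual implementation.

```python
import numpy as np

# Illustrative feathered blending, not the PR's actual code.
def blend_tiles(tiles, positions, out_w, out_h, overlap=32):
    """Merge upscaled tiles into one image, weighting each pixel by a
    linear ramp near tile edges so overlapping regions cross-fade."""
    acc = np.zeros((out_h, out_w, 3), dtype=np.float64)
    weight = np.zeros((out_h, out_w, 1), dtype=np.float64)
    for tile, (x, y) in zip(tiles, positions):
        h, w = tile.shape[:2]
        # Distance (in pixels, starting at 1) from the nearest tile edge.
        wx = np.minimum(np.arange(1, w + 1), np.arange(w, 0, -1))
        wy = np.minimum(np.arange(1, h + 1), np.arange(h, 0, -1))
        # Weight ramps from ~0 at the edge to 1 once `overlap` pixels in.
        mask = np.minimum(np.clip(wx / overlap, 0, 1)[None, :],
                          np.clip(wy / overlap, 0, 1)[:, None])[..., None]
        acc[y:y + h, x:x + w] += tile * mask
        weight[y:y + h, x:x + w] += mask
    return acc / np.maximum(weight, 1e-8)  # avoid division by zero
```

Because every pixel's result is a weighted average of the tiles covering it, seams between neighbouring tiles cross-fade instead of showing a hard edge.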
I'm looking forward to feedback, and I hope I made something that could benefit others, too.
With that, let's take a look at some demo art:
Upscaled docs/source/imgs/diffusers_library.jpg:

And my favorite and the first ever made with this algorithm, a grilled dragon:

Hey @peterwilli,
This looks super cool! Would you mind maybe adding your pipeline to the official table and a pipeline example here: https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#community-examples
This would greatly help the community to use your pipeline :-)
Hey @patrickvonplaten, thanks for the kind words! I'm currently in the process of making these examples, but I'm stuck on how to preload my pipeline: I'm getting strange errors about shutil, and I'm wondering if you could help me out.
I have a colab here: https://colab.research.google.com/drive/1Zlvi64ZkQUarqiAFzyywPRSLUogbH8Xd?usp=sharing
Thanks in advance.
Hey @peterwilli,
Sure, I think if you want to use the "native" upscaler pipeline you can just do:
```python
diffuser_pipeline = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.float16,
)
```
instead of:
```python
diffuser_pipeline = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    custom_pipeline="./diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_upscale.py",
    revision="fp16",
    torch_dtype=torch.float16,
)
```
```python
diffuser_pipeline = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.float16,
)
```
But then this doesn't use my pipeline, right? I feel I've misinterpreted how to apply this. Thanks for the help, by the way!
@patrickvonplaten sorry for the ping, I was wondering if you saw it... And happy new year (in 1 day!)
Happy new year @peterwilli - sorry for being so late here!