cog-stable-diffusion
Reducing inference timings for the SD2.1 base model
I managed to shave a few seconds off inference timings for SD2.1 at both 512x512 (50 steps) and 768x768 (50 steps), using just a few additions:
```python
import torch
from diffusers import StableDiffusionPipeline

# Let cuDNN autotune conv kernels and allow TF32 matmuls (Ampere+ GPUs)
torch.backends.cudnn.benchmark = True
torch.backends.cuda.matmul.allow_tf32 = True

pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_ID,
    cache_dir=MODEL_CACHE,
    local_files_only=True,
)
pipe = pipe.to("cuda")
pipe.enable_xformers_memory_efficient_attention()
pipe.enable_vae_slicing()
```
Output quality didn't suffer because of this; the images are still crisp. How do I go about creating a PR to add these changes? And are there any tests around this?
Here are the inference timings:
- OG stability-ai 512x512 50 steps - 5 secs
- OG stability-ai 768x768 50 steps - 14.3 secs
- pratos sd2.1 512x512 50 steps - 3.3 secs
- pratos sd2.1 768x768 50 steps - 10.6 secs
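For context, timings like these can be taken as a wall-clock average over a few runs after a warmup call (the first call pays the cuDNN autotuning cost when `benchmark = True`). The helper below is my own sketch, not code from this repo; on CUDA you would also call `torch.cuda.synchronize()` before each clock read so queued kernels are included:

```python
import time

def time_inference(fn, warmup=1, runs=3):
    """Average wall-clock latency of `fn` over `runs` calls, after `warmup`
    untimed calls that absorb one-off costs (cuDNN autotune, lazy init)."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs
```

Usage would be something like `time_inference(lambda: pipe(prompt, num_inference_steps=50, width=512, height=512))`.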
Model in question: https://replicate.com/pratos/stable-diffusion-2-1-512