Adds the pipeline for PixArt-alpha ControlNet

This PR adds the ControlNet pipeline for the PixArt-alpha diffusion model.
The following example uses HED edges to control the generation.
import torch
import torchvision.transforms as T
import torchvision.transforms.functional as TF
import PIL.Image as Image
from controlnet_aux import HEDdetector

from diffusers.models import PixArtControlNetAdapterModel
from diffusers.pipelines import PixArtAlphaControlnetPipeline, get_closest_hw

input_image_path = "asset/images/controlnet/car.jpg"
given_image = Image.open(input_image_path)

path_to_controlnet = "raulc0399/pixart-alpha-hed-controlnet"
prompt = "modern car, city in background, clear sky, sunny day"

weight_dtype = torch.float16
image_size = 1024

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# load the ControlNet adapter and the PixArt-alpha pipeline
controlnet = PixArtControlNetAdapterModel.from_pretrained(
    path_to_controlnet,
    torch_dtype=weight_dtype,
    use_safetensors=True,
).to(device)

pipe = PixArtAlphaControlnetPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",
    controlnet=controlnet,
    torch_dtype=weight_dtype,
    use_safetensors=True,
).to(device)

# preprocess the image and generate the HED edge map
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")

width, height = get_closest_hw(given_image.size[0], given_image.size[1], image_size)

condition_transform = T.Compose([
    T.Lambda(lambda img: img.convert("RGB")),
    T.Resize(int(min(height, width))),
    T.CenterCrop([int(height), int(width)]),
    T.ToTensor(),
])

control_image = condition_transform(given_image)
# controlnet_aux expects a PIL image (or numpy array), so convert the tensor back
control_image = TF.to_pil_image(control_image)
hed_edge = hed(control_image, detect_resolution=image_size, image_resolution=image_size)

with torch.no_grad():
    out = pipe(
        prompt=prompt,
        image=hed_edge,
        num_inference_steps=14,
        guidance_scale=4.5,
        height=image_size,
        width=image_size,
    )

out.images[0].save("./output.jpg")
Here are some images: the original image, the control image, and the generated image.
Who can review?
@yiyixuxu @lawrence-cj
Is this the checkpoint? https://huggingface.co/PixArt-alpha/PixArt-ControlNet I don't see any downloads; not sure if it's tracking correctly.
Is this PixArt-alpha ControlNet used a lot in the community? If not, maybe we can start with a community pipeline?
Also cc @asomoza
@yiyixuxu That is the PixArt ControlNet model for HED conditioning, as uploaded by the PixArt authors. For this pipeline I have converted the ControlNet layers to safetensors and uploaded them here: https://huggingface.co/raulc0399/pixart-alpha-hed-controlnet
They can be used with this pipeline.
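For reference, a rough sketch of how such a .pth-to-safetensors conversion could look (the checkpoint filename and key layout here are placeholders, not the exact conversion script):

```python
# Hypothetical sketch: convert the original .pth ControlNet weights to safetensors.
# The input filename and the flat state-dict layout are assumptions for illustration.
import torch
from safetensors.torch import save_file

checkpoint = torch.load("controlnet_hed.pth", map_location="cpu")
# some checkpoints nest the weights under a "state_dict" key
state_dict = checkpoint.get("state_dict", checkpoint)

# safetensors stores only tensors, and they must be contiguous
tensors = {k: v.contiguous() for k, v in state_dict.items() if isinstance(v, torch.Tensor)}
save_file(tensors, "diffusion_pytorch_model.safetensors")
```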
Why does it have its own implementation of the HED detector? Does it not work with the regular one that everyone uses? Have you tested it with the one from the controlnet_aux library?
@asomoza
The sample above just used the HED class that the authors had in their repository, which is what was used to train their HED ControlNet.
But I just checked, and it seems to be the same as, or rather adapted from, the one in controlnet_aux.
Thanks, I'll give it a test later. I was asking because if it was trained with a custom HED detector that produces different results from the default one, it will be really hard for people to use it.
It would be nice if you could post some results (images) in the PR description.
Using the HED from controlnet_aux, it loses some quality. I will try some more tests with that one.
I also have a training script that I am testing before creating a PR: https://github.com/raulc0399/PixArt-alpha/blob/master_train_controlnet_diffusers/controlnet/train_pixart_controlnet_hf.py
It can be used to train further models.
Will do; I'll post some result images in the PR description.
I have to correct my previous comment: I was using the default parameters for HED, which resized the image to 512. If I use 1024 instead, it works as it should.
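For reference, something like this (using the controlnet_aux HEDdetector, with the image path taken from the example above):

```python
import PIL.Image as Image
from controlnet_aux import HEDdetector

hed = HEDdetector.from_pretrained("lllyasviel/Annotators")
image = Image.open("asset/images/controlnet/car.jpg")

# detect_resolution and image_resolution default to 512 in controlnet_aux;
# setting both to 1024 keeps the edge map at the resolution the pipeline generates at
hed_edge = hed(image, detect_resolution=1024, image_resolution=1024)
hed_edge.save("./hed_edge.png")
```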
Thanks, the results look nice. Since we only have one ControlNet, maybe do what @yiyixuxu suggested: let's start with a community pipeline first, and then, as it gets traction and we have more ControlNets, move it to core.
@asomoza
OK, I will move it to the examples folder and put the training script there as well. I have done some initial tests on the "fusing/fill50k" dataset to validate that it works.
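For example, a quick way to peek at that dataset before training (the column names are the ones the standard diffusers ControlNet training examples use, so treat them as an assumption for this script):

```python
from datasets import load_dataset

# fusing/fill50k is the small synthetic dataset used by the diffusers ControlNet
# training examples; assumed columns: "image", "conditioning_image", "text"
dataset = load_dataset("fusing/fill50k", split="train")
print(dataset)
print(dataset[0]["text"])
```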
@yiyixuxu @asomoza I have moved everything to the examples folder and also added the training script, together with .sh files for starting the training and for running the pipeline.
@yiyixuxu The last commit moves the pipeline and the example of how to run it to examples/community.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@sayakpaul can you take a look to see if we can merge this now?
@raulc0399 can you run make style?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@raulc0399 sorry about the delay here. But I think we need to have the ControlNet model and pipeline implementations under example/community/research_projects as suggested in https://github.com/huggingface/diffusers/pull/8857#discussion_r1687219351
@sayakpaul Currently the pipeline is in examples/community/pipeline_pixart_alpha_controlnet.py, with the ControlNet blocks and training script in examples/community/pixart - I thought this was what was requested in the link you quoted.
So everything is under examples/community.
Is it OK like this, or should I move it under research_projects?
I think putting everything under research_projects could make more sense here. We could name it "pixart_controlnet". This way the ControlNet model, pipeline, and training script could live under one directory.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Gentle ping @raulc0399 - are we interested in moving this to the research folder?
@yiyixuxu yes will do, sorry for the delay.
@yiyixuxu I have moved the pipeline and the training script under research_projects/pixart.
@sayakpaul does this look good to you now?
May I request permission to push commits to raulc0399:main_pixart_alpha_controlnet, so that I can help run make style and make quality? @raulc0399
@lawrence-cj sure
ERROR: Permission to raulc0399/diffusers.git denied to lawrence-cj. Could not read from remote repository. Please make sure you have the correct access rights and the repository exists.
It seems I still cannot push commits to your branch.
@lawrence-cj I just invited you as a collaborator.
Cool, @raulc0399. I already ran make style && make quality.
Gentle ping yiyi @yiyixuxu.
thank you @lawrence-cj @raulc0399
Thank you so much. Respect. @raulc0399 @sayakpaul @yiyixuxu