
[core] Allegro T2V

Open a-r-r-o-w opened this issue 1 year ago • 4 comments

What does this PR do?

Model: https://huggingface.co/rhymes-ai/Allegro Github: https://github.com/rhymes-ai/Allegro

a-r-r-o-w avatar Oct 21 '24 22:10 a-r-r-o-w

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

It looks like something broke when doing the VAE refactor - looking into it at the moment. Will fix the broken tests afterwards

a-r-r-o-w avatar Oct 22 '24 23:10 a-r-r-o-w

Update: not a bug in the VAE - made a mistake in the transformer. Doing a more careful run through it again. @yiyixuxu repeat_interleave and pad temporal changes are in and work as expected, thanks!

a-r-r-o-w avatar Oct 23 '24 02:10 a-r-r-o-w

LGTM. There's a failing test that looks related to saving/loading the transformer.

DN6 avatar Oct 23 '24 09:10 DN6

I have something really weird happening here. Feeding in a list of prompts (even if the list contains only one prompt) results in a really bad video. It might be related to https://github.com/huggingface/diffusers/pull/9769#discussion_r1817614450, but I can't be sure. The output looks like CFG is baked in, so a CFG error would make sense.

prompt = ["Orbital shot of a squirrel nibbles on a nut while sitting in a tree"]:

https://github.com/user-attachments/assets/11eaab98-9be6-44d9-8904-76e762d91c60

prompt = "Orbital shot of a squirrel nibbles on a nut while sitting in a tree":

https://github.com/user-attachments/assets/13d15a6d-de81-4abe-b077-2e15beb7774c

Ednaordinary avatar Oct 27 '24 00:10 Ednaordinary
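For context on what "looks CFG baked" means here: classifier-free guidance extrapolates from the unconditional prediction toward the conditional one at every denoising step. A minimal plain-Python sketch of that combination (not the diffusers implementation; just the formula) shows why a bug that swaps or duplicates the cond/uncond batch halves, e.g. from mishandled prompt lists, corrupts every step the same way:

```python
def apply_cfg(noise_uncond, noise_cond, guidance_scale):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the conditional one, elementwise."""
    return [u + guidance_scale * (c - u) for u, c in zip(noise_uncond, noise_cond)]

# guidance_scale=1.0 reduces to the conditional prediction alone,
# guidance_scale=0.0 to the unconditional one; at 7.5 the conditional
# direction is amplified, so swapped/duplicated batch halves degrade
# the output at every step.
```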

@Ednaordinary Can you share a code snippet? Could you include how you're loading the pipeline? I'm unable to reproduce the issue.

DN6 avatar Oct 28 '24 15:10 DN6

I checked further: it happens when the prompt and negative prompt are length-one lists (even when the list is [None], I think), but not when the prompt is a list and the negative prompt is unspecified (my mistake earlier). I have yet to test further.

That was kinda incoherent, so here's a snippet:

```python
video = model(
    ["A squirrel sitting on a tree and nibbling on an acorn."],
    negative_prompt=[""],
    num_frames=88,
    num_inference_steps=20,
    guidance_scale=7.5,
)
```

I'm using UniPC since it's way faster. Testing without negative_prompt specified at all, it works fine. ~~I haven't tested with commit 9214f4a merged yet, though~~

Ednaordinary avatar Oct 28 '24 23:10 Ednaordinary

@Ednaordinary So the problem is that you are using an empty negative prompt?

foreverpiano avatar Oct 29 '24 00:10 foreverpiano

@foreverpiano I don't believe so, since passing None in a list to negative_prompt also seems to trigger it. The output also looks suspiciously like CFG baking, and while I can't be certain, I don't think an empty negative prompt alone would cause that.

The way I'm passing in arguments is the same interface I use for other pipelines, which I've never had issues with. I use a batching mechanism that passes multiple prompts in as a list, even if there's only one prompt; negative_prompt is converted to None if it's blank. Changing this to pass a plain string instead of a list and only batching one generation (multi-prompt doesn't currently seem to work on this pipeline regardless) fixed things, for whatever reason.
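The batching behavior described above can be sketched as a small normalization helper (hypothetical code for illustration, not the actual diffusers implementation). The point is the equivalence the pipeline is expected to honor: a string prompt and a one-element list, and likewise a blank/None negative prompt and no negative prompt at all, should produce the same call:

```python
def normalize_prompts(prompt, negative_prompt=None):
    """Canonicalize prompt inputs: wrap strings into lists, and treat
    blank/None negatives as 'no negative prompt'.

    Hypothetical helper illustrating the expected equivalence; not
    actual diffusers code."""
    prompts = [prompt] if isinstance(prompt, str) else list(prompt)
    if negative_prompt is None:
        negatives = None
    else:
        if isinstance(negative_prompt, str):
            negative_prompt = [negative_prompt]
        # Drop the negative prompt entirely if every entry is blank/None,
        # matching the "converted to None if it's blank" behavior above.
        if all(not n for n in negative_prompt):
            negatives = None
        else:
            negatives = list(negative_prompt)
    return prompts, negatives

# All of these should be equivalent after normalization:
#   normalize_prompts("prompt")
#   normalize_prompts(["prompt"], [""])
#   normalize_prompts(["prompt"], [None])
```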

[None] as negative_prompt:

https://github.com/user-attachments/assets/70b85a17-2590-4bf8-a2c1-caafb0ea7549

Ednaordinary avatar Oct 29 '24 02:10 Ednaordinary

@Ednaordinary I think this should be fixed with the latest commit. LMK if it still persists - if so, I'll fix it in a follow-up PR.

a-r-r-o-w avatar Oct 29 '24 07:10 a-r-r-o-w