[SD3] pipe.enable_xformers_memory_efficient_attention
Describe the bug
RuntimeError: The size of tensor a (154) must match the size of tensor b (2304) at non-singleton dimension 1
Reproduction
# StableDiffusion3Pipeline (checkpoint below is illustrative; any SD3 checkpoint reproduces this)
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3-medium-diffusers")
pipe.enable_xformers_memory_efficient_attention()  # raises the RuntimeError above
Logs
No response
System Info
diffusers==0.29.0 python=3.10 pytorch=2.3
Who can help?
@yiyixuxu @DN6 @sayakpaul
I don’t think we allow xformers attention in the SD3 blocks. Would you be interested in opening a PR? We will be happy to guide you.
It might be a bit challenging for me as my understanding of xformers is currently at the application level.
Oh that is okay. Here are a couple of reference pointers for you:
- This is the standard xformers attention processor class: https://github.com/huggingface/diffusers/blob/f96e4a16adb4c31bab4c0a3d0d145ed2b086ecb0/src/diffusers/models/attention_processor.py#L1312
- So, similarly, you would have to implement one for the joint attention processor block. More specifically, you will have to use ops from `xformers` in place of the native PyTorch ops here: https://github.com/huggingface/diffusers/blob/f96e4a16adb4c31bab4c0a3d0d145ed2b086ecb0/src/diffusers/models/attention_processor.py#L1135 (making sure the dimensions satisfy the criteria of xformers). A rough sketch of the idea follows below.
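For concreteness, here is a minimal sketch of what such a joint processor could look like. It assumes the attention module exposes the same projections as `JointAttnProcessor2_0` (`to_q`/`to_k`/`to_v` for the image stream, `add_q_proj`/`add_k_proj`/`add_v_proj`/`to_add_out` for the text stream) and omits the ndim-4 reshapes and the `context_pre_only` branch of the real processor; it is an illustration, not the merged implementation.

```python
import torch
import xformers.ops


class XFormersJointAttnProcessor:
    """Sketch: joint (SD3 MMDiT) attention via xformers memory-efficient attention."""

    def __init__(self, attention_op=None):
        self.attention_op = attention_op

    def __call__(self, attn, hidden_states, encoder_hidden_states=None, attention_mask=None):
        residual = hidden_states
        batch_size = hidden_states.shape[0]

        # `sample` (image-token) projections.
        query = attn.to_q(hidden_states)
        key = attn.to_k(hidden_states)
        value = attn.to_v(hidden_states)

        # `context` (text-token) projections, concatenated along the sequence
        # axis so both streams attend jointly.
        query = torch.cat([query, attn.add_q_proj(encoder_hidden_states)], dim=1)
        key = torch.cat([key, attn.add_k_proj(encoder_hidden_states)], dim=1)
        value = torch.cat([value, attn.add_v_proj(encoder_hidden_states)], dim=1)

        head_dim = key.shape[-1] // attn.heads
        # xformers expects (batch, seq_len, num_heads, head_dim) -- note: no
        # transpose, unlike the (batch, heads, seq, head_dim) layout for SDPA.
        query = query.view(batch_size, -1, attn.heads, head_dim)
        key = key.view(batch_size, -1, attn.heads, head_dim)
        value = value.view(batch_size, -1, attn.heads, head_dim)

        hidden_states = xformers.ops.memory_efficient_attention(
            query, key, value, attn_bias=attention_mask, op=self.attention_op
        )
        hidden_states = hidden_states.reshape(batch_size, -1, attn.heads * head_dim)
        hidden_states = hidden_states.to(query.dtype)

        # Split the joint sequence back into image and text streams.
        hidden_states, encoder_hidden_states = (
            hidden_states[:, : residual.shape[1]],
            hidden_states[:, residual.shape[1] :],
        )

        # Output projections: linear + dropout for the image stream, a single
        # linear for the text stream.
        hidden_states = attn.to_out[0](hidden_states)
        hidden_states = attn.to_out[1](hidden_states)
        encoder_hidden_states = attn.to_add_out(encoder_hidden_states)

        return hidden_states, encoder_hidden_states
```

Note that xformers additionally constrains the inputs (e.g. supported dtypes and head dimensions), which is what "making sure the dimensions satisfy the criteria of xformers" refers to.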
Does this seem like a feature you would be interested in contributing? Not only would we greatly appreciate it, but we would also help you along by providing guidance. #8276
Let us know.
Thank you for your guidance. I’m interested in attempting this and plan to work on it during my free time after work.
That would be great! As mentioned, we will be more than happy to guide you throughout.
- I have implemented the XFormersJointAttnProcessor, but after calling `pipe.enable_xformers_memory_efficient_attention()`, `self.attn.processor` is always set to `XFormersAttnProcessor`. Where should I configure things so that the correct processor is selected?
- Since the first issue has not been resolved, I did not call `pipe.enable_xformers_memory_efficient_attention()`; instead, for testing, I temporarily changed `processor = JointAttnProcessor2_0()` to `processor = XFormersJointAttnProcessor()` directly in the JointTransformerBlock. I found that the `attention_mask` passed to the processor is `None`. Why is this happening?
Thanks for your updates!
Do you want to open a PR with your implementation and tag myself and @yiyixuxu there?
It is perfectly okay to have it in an incomplete state.
I’ll submit the PR after I’ve finished testing to make sure there are no issues. Currently, the attention_mask parameter is None, and I’m not sure if this is a problem. Could you help clarify my concerns in the two points above?
I think a `None` attention_mask value is fine.
Here is an example of how you can set the right processor: https://github.com/huggingface/diffusers/blob/a899e42fc78fbd080452ce88d00dbf704d115280/src/diffusers/models/attention_processor.py#L381
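Concretely, the dispatch that method performs could be extended along these lines (a hedged sketch; `XFormersJointAttnProcessor` is the class from your branch, not an existing diffusers export):

```python
from diffusers.models.attention_processor import (
    JointAttnProcessor2_0,
    XFormersAttnProcessor,
)


def set_use_memory_efficient_attention_xformers(self, use_memory_efficient_attention_xformers, attention_op=None):
    # Sketch: the availability checks and the restore-default path of the
    # real method are elided here.
    if use_memory_efficient_attention_xformers:
        if isinstance(self.processor, JointAttnProcessor2_0):
            # Joint (SD3 MMDiT) blocks get the joint xformers processor...
            processor = XFormersJointAttnProcessor(attention_op=attention_op)
        else:
            # ...all other blocks keep the standard xformers processor.
            processor = XFormersAttnProcessor(attention_op=attention_op)
        self.set_processor(processor)
```

Without a branch like the `isinstance` check above, every `Attention` module, including the joint ones, ends up with the plain `XFormersAttnProcessor`, which is the behavior you are seeing.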
I have already opened a PR. The attention_mask is currently `None`, so I am not handling it for now.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@sayakpaul Do we still want this? There was another discussion opened today regarding this: https://github.com/huggingface/diffusers/discussions/9681
I think there were plans to remove xformers support from the library in the future, but it would still be compatible with the set_attn_processor() methods for users who prefer to use it.
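For users who prefer xformers in that scenario, opting in manually would look something like this (sketch; `XFormersJointAttnProcessor` as in the PR above):

```python
# Manual opt-in, independent of enable_xformers_memory_efficient_attention();
# XFormersJointAttnProcessor is the processor from the PR, not a current
# diffusers export.
pipe.transformer.set_attn_processor(XFormersJointAttnProcessor())
```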
Sure. @DN6 could you give a final review of https://github.com/huggingface/diffusers/pull/8583?