[SD3] pipe.enable_xformers_memory_efficient_attention
Describe the bug
RuntimeError: The size of tensor a (154) must match the size of tensor b (2304) at non-singleton dimension 1
Reproduction
# StableDiffusion3Pipeline (checkpoint below is illustrative; any SD3 checkpoint reproduces this)
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3-medium-diffusers")
pipe.enable_xformers_memory_efficient_attention()  # raises the RuntimeError above
Logs
No response
System Info
diffusers==0.29.0 python=3.10 pytorch=2.3
Who can help?
@yiyixuxu @DN6 @sayakpaul
I don’t think we allow xformers attention in the SD3 blocks. Would you be interested in opening a PR? We will be happy to guide you.
It might be a bit challenging for me as my understanding of xformers is currently at the application level.
Oh that is okay. Here are a couple of reference pointers for you:
- This is the standard xformers attention processor class: https://github.com/huggingface/diffusers/blob/f96e4a16adb4c31bab4c0a3d0d145ed2b086ecb0/src/diffusers/models/attention_processor.py#L1312
- So, similarly, you would have to implement one for the joint attention processor block. More specifically, you will have to use ops from `xformers` in place of the native PyTorch ops here: https://github.com/huggingface/diffusers/blob/f96e4a16adb4c31bab4c0a3d0d145ed2b086ecb0/src/diffusers/models/attention_processor.py#L1135 (making sure the dimensions satisfy the criteria of xformers). A rough sketch of the idea follows below.
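For concreteness, here is a minimal sketch of what such a joint processor could look like. It assumes the attention module exposes the same projections as `JointAttnProcessor2_0` (`to_q`/`to_k`/`to_v` for the image stream, `add_q_proj`/`add_k_proj`/`add_v_proj`/`to_add_out` for the text stream) and omits the ndim-4 reshapes and the `context_pre_only` branch of the real processor; it is an illustration, not the merged implementation.

```python
import torch
import xformers.ops


class XFormersJointAttnProcessor:
    """Sketch: joint (SD3 MMDiT) attention via xformers memory-efficient attention."""

    def __init__(self, attention_op=None):
        self.attention_op = attention_op

    def __call__(self, attn, hidden_states, encoder_hidden_states=None, attention_mask=None):
        residual = hidden_states
        batch_size = hidden_states.shape[0]

        # `sample` (image-token) projections.
        query = attn.to_q(hidden_states)
        key = attn.to_k(hidden_states)
        value = attn.to_v(hidden_states)

        # `context` (text-token) projections, concatenated along the sequence
        # axis so both streams attend jointly.
        query = torch.cat([query, attn.add_q_proj(encoder_hidden_states)], dim=1)
        key = torch.cat([key, attn.add_k_proj(encoder_hidden_states)], dim=1)
        value = torch.cat([value, attn.add_v_proj(encoder_hidden_states)], dim=1)

        head_dim = key.shape[-1] // attn.heads
        # xformers expects (batch, seq_len, num_heads, head_dim) -- note: no
        # transpose, unlike the (batch, heads, seq, head_dim) layout for SDPA.
        query = query.view(batch_size, -1, attn.heads, head_dim)
        key = key.view(batch_size, -1, attn.heads, head_dim)
        value = value.view(batch_size, -1, attn.heads, head_dim)

        hidden_states = xformers.ops.memory_efficient_attention(
            query, key, value, attn_bias=attention_mask, op=self.attention_op
        )
        hidden_states = hidden_states.reshape(batch_size, -1, attn.heads * head_dim)
        hidden_states = hidden_states.to(query.dtype)

        # Split the joint sequence back into image and text streams.
        hidden_states, encoder_hidden_states = (
            hidden_states[:, : residual.shape[1]],
            hidden_states[:, residual.shape[1] :],
        )

        # Output projections: linear + dropout for the image stream, a single
        # linear for the text stream.
        hidden_states = attn.to_out[0](hidden_states)
        hidden_states = attn.to_out[1](hidden_states)
        encoder_hidden_states = attn.to_add_out(encoder_hidden_states)

        return hidden_states, encoder_hidden_states
```

Note that xformers additionally constrains the inputs (e.g. supported dtypes and head dimensions), which is what "making sure the dimensions satisfy the criteria of xformers" refers to.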
Does this seem like a feature you would be interested in contributing? Not only would we greatly appreciate it, but we would also help you along by providing guidance. #8276
Let us know.
Thank you for your guidance. I’m interested in attempting this and plan to work on it during my free time after work.
That would be great! As mentioned, we will be more than happy to guide you throughout.
- I have implemented the XFormersJointAttnProcessor, but after calling `pipe.enable_xformers_memory_efficient_attention()`, `self.attn.processor` is always set to `XFormersAttnProcessor`. Where should I configure things so that the correct processor is selected?
- Since the first issue has not been resolved, I did not call `pipe.enable_xformers_memory_efficient_attention()`; instead, for testing, I temporarily changed `processor = JointAttnProcessor2_0()` to `processor = XFormersJointAttnProcessor()` directly in the JointTransformerBlock. I found that the `attention_mask` passed to the processor is `None`. Why is this happening?
Thanks for your updates!
Do you want to open a PR with your implementation and tag myself and @yiyixuxu there?
It is perfectly okay to have it in an incomplete state.
I’ll submit the PR after I’ve finished testing to make sure there are no issues. Currently, the attention_mask parameter is None, and I’m not sure if this is a problem. Could you help clarify my concerns in the two points above?
I think a `None` attention_mask value is fine.
Here is an example of how you can set the right processor: https://github.com/huggingface/diffusers/blob/a899e42fc78fbd080452ce88d00dbf704d115280/src/diffusers/models/attention_processor.py#L381
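Concretely, the dispatch that method performs could be extended along these lines (a hedged sketch; `XFormersJointAttnProcessor` is the class from your branch, not an existing diffusers export):

```python
from diffusers.models.attention_processor import (
    JointAttnProcessor2_0,
    XFormersAttnProcessor,
)


def set_use_memory_efficient_attention_xformers(self, use_memory_efficient_attention_xformers, attention_op=None):
    # Sketch: the availability checks and the restore-default path of the
    # real method are elided here.
    if use_memory_efficient_attention_xformers:
        if isinstance(self.processor, JointAttnProcessor2_0):
            # Joint (SD3 MMDiT) blocks get the joint xformers processor...
            processor = XFormersJointAttnProcessor(attention_op=attention_op)
        else:
            # ...all other blocks keep the standard xformers processor.
            processor = XFormersAttnProcessor(attention_op=attention_op)
        self.set_processor(processor)
```

Without a branch like the `isinstance` check above, every `Attention` module, including the joint ones, ends up with the plain `XFormersAttnProcessor`, which is the behavior you are seeing.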
I have already opened a PR. The attention_mask is currently `None`, so I am not handling it for now.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@sayakpaul Do we still want this? There was another discussion opened today regarding this: https://github.com/huggingface/diffusers/discussions/9681
I think there were plans to remove xformers support from the library in the future, but it would still be compatible with the set_attn_processor() methods for users who prefer to use it.
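For users who prefer xformers in that scenario, opting in manually would look something like this (sketch; `XFormersJointAttnProcessor` as in the PR above):

```python
# Manual opt-in, independent of enable_xformers_memory_efficient_attention();
# XFormersJointAttnProcessor is the processor from the PR, not a current
# diffusers export.
pipe.transformer.set_attn_processor(XFormersJointAttnProcessor())
```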
Sure. @DN6 could you give a final review of https://github.com/huggingface/diffusers/pull/8583?