Attention masks are missing in SD3 to mask out text padding tokens

Open reminisce opened this issue 1 year ago • 2 comments

Describe the bug

The SD3 attention implementation currently does not use attention masks. As a result, whenever the text tokens contain padding, the output is inconsistent across different values of max_seq_length, because the padding tokens receive non-zero attention scores. This was discussed in https://github.com/huggingface/diffusers/discussions/8628, and this issue has been created to track progress on a fix.
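The effect can be illustrated with a toy example. The sketch below is not the diffusers implementation; it is a minimal single-query softmax attention in plain Python, and every name in it (`attend`, `run`, `PAD`, the embedding values) is hypothetical. It assumes the padding token has a non-zero embedding, pads the same three "text" tokens to two different lengths, and compares the attention output with and without a mask over the padded positions.

```python
import math


def softmax(scores, mask=None):
    """Softmax over a list of scores; positions where mask is False get zero weight."""
    if mask is not None:
        scores = [s if keep else float("-inf") for s, keep in zip(scores, mask)]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values, mask=None):
    """Scaled dot-product attention for a single query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores, mask)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


# Toy data (assumed values, chosen only for illustration).
query = [1.0, 0.5]
text = [[0.2, 0.1], [0.4, -0.3], [-0.1, 0.6]]  # three real text tokens
PAD = [0.05, 0.05]  # non-zero embedding for the padding token


def run(max_seq_length, use_mask):
    """Pad the text to max_seq_length and attend over it, optionally masking padding."""
    tokens = text + [PAD] * (max_seq_length - len(text))
    mask = [i < len(text) for i in range(max_seq_length)] if use_mask else None
    return attend(query, tokens, tokens, mask)


unmasked_short = run(4, use_mask=False)
unmasked_long = run(8, use_mask=False)
masked_short = run(4, use_mask=True)
masked_long = run(8, use_mask=True)

print("unmasked:", unmasked_short, unmasked_long)
print("masked:  ", masked_short, masked_long)
```

Without the mask, each extra padding token takes a share of the softmax weight, so the output moves as the padded length grows. With the mask, the padded positions get zero weight and the output no longer depends on the padded length.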

Thanks @sayakpaul for the discussion.

Reproduction

n/a

Logs

No response

System Info

n/a

Who can help?

No response

reminisce avatar Jun 24 '24 05:06 reminisce

Hi @sayakpaul, I am interested in working on this issue

rootonchair avatar Jun 25 '24 02:06 rootonchair

Thanks for your interest! Sure, let’s go.

sayakpaul avatar Jun 25 '24 03:06 sayakpaul

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Sep 14 '24 15:09 github-actions[bot]

Hi guys, if no one is working on this, I'm willing to pick it up 👍 @sayakpaul

SakshamDhawan avatar Oct 29 '24 13:10 SakshamDhawan

Gentle ping to keep the activity going. @rootonchair Would you be able to contribute the fix?

a-r-r-o-w avatar Nov 20 '24 02:11 a-r-r-o-w

Yes @a-r-r-o-w, I will open a PR soon

rootonchair avatar Nov 20 '24 16:11 rootonchair

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Dec 15 '24 15:12 github-actions[bot]