diffusers
question about attention block
Why do we use torch.baddbmm to compute query @ key? https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention.py#L640
In our experiments this was the fastest way to compute query @ key :-) See: https://github.com/huggingface/diffusers/pull/371 https://github.com/huggingface/diffusers/pull/511
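For context, a minimal sketch of the idea (shapes and names here are illustrative, not the library's actual code): `torch.baddbmm` computes `beta * input + alpha * (batch1 @ batch2)` in a single fused kernel, so passing `beta=0` and `alpha=scale` folds the attention scaling into the matmul instead of running a separate multiply over the score tensor.

```python
import torch

# Illustrative shapes: (batch * heads, seq_len, head_dim)
query = torch.randn(2, 4, 8)
key = torch.randn(2, 4, 8)
scale = 8 ** -0.5  # 1 / sqrt(head_dim)

# Naive approach: bmm then scale (two kernels, extra pass over the scores)
scores_naive = (query @ key.transpose(-1, -2)) * scale

# baddbmm approach: beta=0 ignores the input tensor entirely,
# alpha fuses the scaling into the same kernel as the matmul
scores_fused = torch.baddbmm(
    torch.empty(2, 4, 4),          # placeholder; disregarded when beta=0
    query,
    key.transpose(-1, -2),
    beta=0,
    alpha=scale,
)

assert torch.allclose(scores_naive, scores_fused, atol=1e-6)
```

Per the PyTorch docs, when `beta=0` the `input` argument is disregarded (even NaN/inf values are not propagated), which is why an uninitialized `torch.empty` tensor is safe to pass there.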