[Bug] Fix flash_attn_func Return Value Handling for flash-attn3 Compatibility in wan_video_dit Model
🐛 Problem Description

In the attention module of the wan_video_dit model, the current flash_attn_func call is incompatible with the interface of flash-attn >= 3.0.0b.

Original code:
```python
x = flash_attn_interface.flash_attn_func(q, k, v)
```
Error message:

```
ValueError: too many values to unpack (expected 1)
```
🔍 Root Cause

The flash-attn3 beta changed the interface: https://github.com/Dao-AILab/flash-attention/blob/main/hopper/flash_attn_interface.py#L518 shows that the function now returns Tuple[Tensor, ...], so the call site must accept at least two return values.
🛠 Suggested Fix

```diff
- x = flash_attn_interface.flash_attn_func(q, k, v)
+ x, _ = flash_attn_interface.flash_attn_func(q, k, v)  # explicitly unpack the return value
```
📌 Additional Note

Beta status: flash-attn3 is still in beta, and the official interface may continue to change.
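Because the beta interface may keep changing, one option is a small compatibility wrapper that normalizes the return value instead of hard-coding the unpacking at each call site. The sketch below is illustrative only: the name `flash_attn_func_compat` is hypothetical, and it assumes the `flash_attn_interface` module from the flash-attn 3 (hopper) build is importable.

```python
# Hypothetical compatibility wrapper (not part of the original report).
import torch
import flash_attn_interface  # assumes the flash-attn 3 (hopper) interface is installed


def flash_attn_func_compat(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Return only the attention output, regardless of flash-attn version."""
    out = flash_attn_interface.flash_attn_func(q, k, v)
    # flash-attn >= 3.0.0b returns a tuple whose first element is the attention
    # output; flash-attn 2 returns the output tensor directly.
    if isinstance(out, tuple):
        out = out[0]
    return out
```

This keeps the model code working if a later beta changes the number of returned values, as long as the attention output stays first in the tuple.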
I want to try Flash Attention 3, but its compilation fails on Windows :(
https://github.com/Dao-AILab/flash-attention/issues/1524
@motoight Thank you for your feedback. This is a time-sensitive issue, and we will keep an eye on it.
On the other hand, we understand that the PyTorch team is gradually integrating Flash Attention-related technologies. Currently, SDPA in PyTorch supports Flash Attention 2, which is more stable than the original implementation. Therefore, we do not want users to get into the habit of installing Flash Attention separately; instead, we aim to rely uniformly on PyTorch's implementation of Flash Attention. For now, we will keep this functionality until the PyTorch-based path is complete.
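For reference, here is a minimal sketch of the PyTorch-native path mentioned above, using SDPA restricted to its Flash Attention backend. It assumes PyTorch 2.3+ (where `torch.nn.attention.sdpa_kernel` is available), a CUDA device, and the usual (batch, heads, seq_len, head_dim) layout; the shapes are placeholders, not the model's real dimensions.

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Placeholder tensors in (batch, heads, seq_len, head_dim) layout.
q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)

# Restrict SDPA to the Flash Attention backend; PyTorch raises an error if
# that backend cannot run on the current hardware/dtype combination.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    x = F.scaled_dot_product_attention(q, k, v)
```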
@Artiprocher Thank you. Could you please add GGUF support? It really helps models fit into lower VRAM at the cost of some quality.
GGUF is becoming very widely used.