
[Bug] Fix flash_attn_func Return Value Handling for flash-attn3 Compatibility in wan_video_dit Model

Open · motoight opened this issue 10 months ago · 3 comments

🐛 Problem Description

In the attention module of the wan_video_dit model, the current way flash_attn_func is called is incompatible with the interface in flash-attn >= 3.0.0b. Original code:

```python
x = flash_attn_interface.flash_attn_func(q, k, v)
```

Error message:

```
ValueError: too many values to unpack (expected 1)
```

🔍 Root Cause

The flash-attn3 beta changed the interface: https://github.com/Dao-AILab/flash-attention/blob/main/hopper/flash_attn_interface.py#L518 shows that the function now returns Tuple[Tensor, ...], so the call site needs at least two variables to receive the return values.

🛠 Suggested Fix

```diff
- x = flash_attn_interface.flash_attn_func(q, k, v)
+ x, _ = flash_attn_interface.flash_attn_func(q, k, v)  # explicitly unpack the return values
```

📌 Additional Notes

Beta-status caveat: flash-attn3 is still in beta, and the official interface may change again.
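Given that caveat, a minimal compatibility sketch (not the repository's actual code; the wrapper name flash_attn_compat is hypothetical) may be more robust than hard-coding a two-value unpack: it normalizes the return value so the caller always receives a single Tensor, whether flash-attn 2.x (returns a Tensor) or the flash-attn 3 beta (returns a tuple of Tensors) is installed.

```python
import torch

# flash-attn 3 (hopper build) ships this module; guard the import so the
# sketch also loads in environments where flash-attn is absent.
try:
    import flash_attn_interface
except ImportError:
    flash_attn_interface = None

def flash_attn_compat(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Hypothetical wrapper, not DiffSynth-Studio code.
    out = flash_attn_interface.flash_attn_func(q, k, v)
    # The flash-attn 3 beta returns a tuple whose first element is the
    # attention output; flash-attn 2.x returns the Tensor directly.
    if isinstance(out, tuple):
        out = out[0]
    return out
```

Dispatching on the return type rather than writing `x, _ = ...` keeps the call site working even if the beta changes the number of returned values again.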

motoight · Mar 20 '25

I want to try Flash Attention 3, but its compilation fails on Windows :(

https://github.com/Dao-AILab/flash-attention/issues/1524

FurkanGozukara · Mar 20 '25

@motoight Thank you for your feedback. This is a time-sensitive issue, and we will keep an eye on it.

On the other hand, we understand that the PyTorch team is gradually integrating Flash Attention-related technology. Currently, SDPA in PyTorch supports Flash Attention 2, which is more stable than the standalone package. We therefore do not want users to get into the habit of installing Flash Attention separately; instead, we aim to rely uniformly on PyTorch's implementation of Flash Attention. For now, we will retain this functionality until we have completed the migration to PyTorch's implementation.

Artiprocher · Mar 24 '25
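A minimal sketch of the PyTorch-native route the maintainer describes (the tensor shapes and the PyTorch >= 2.3 sdpa_kernel API are assumptions, not anything stated in this thread): scaled_dot_product_attention dispatches to PyTorch's built-in Flash Attention kernel, so no separate flash-attn install is needed.

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Illustrative shapes: (batch, heads, seq_len, head_dim), half precision on GPU.
q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)

# Restrict SDPA to its Flash Attention backend (PyTorch >= 2.3;
# older releases expose torch.backends.cuda.sdp_kernel instead).
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    x = F.scaled_dot_product_attention(q, k, v)
```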

@Artiprocher thank you. Could you please add GGUF support? It really lets models fit into lower VRAM at the cost of some quality.

GGUF is becoming very widely used.

FurkanGozukara · Mar 24 '25