Fused attention error while running Nvidia Cosmos
Hello
I am trying to run the latest Nvidia Cosmos model on an RTX 4090 and I get an error when fused attention is called, at line 1080 in fused_attn.py (fused_attn_forward): output_tensors = tex.fused_attn_fwd(...)
The transformer_engine compilation didn't produce any errors during installation, and I have cuDNN v9.6.0 installed. I have Flash Attention 2.7.3; could this be the issue (there is a warning saying that 2.6.3 is the latest supported version)? Is Flash Attention used behind the scenes?
E! CuDNN (v90100 70) function cudnnBackendFinalize() called:
e! Error: CUDNN_STATUS_EXECUTION_FAILED; Reason: rtc->loadModule()
e! Error: CUDNN_STATUS_EXECUTION_FAILED; Reason: ptr.isSupported()
e! Error: CUDNN_STATUS_EXECUTION_FAILED; Reason: engine_post_checks(*engine_iface, engine.getPerfKnobs(), req_size, engine.getTargetSMCount())
e! Error: CUDNN_STATUS_EXECUTION_FAILED; Reason: finalize_internal()
e! Time: 2025-01-14T00:21:03.310308 (0d+0h+3m+34s since start)
e! Process=381629; Thread=381629; GPU=NULL; Handle=NULL; StreamId=NULL.
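In case it helps, this is roughly the check I run to confirm which versions are actually loaded at runtime (a minimal sketch; it assumes the packages were installed under their usual distribution names, so adjust those if your install differs):

# Print the versions relevant to the fused-attention path.
# Assumes the usual distribution names ("flash-attn", "transformer_engine");
# adjust if your install differs.
import torch
from importlib.metadata import version

print("torch:", torch.__version__, "| built for CUDA", torch.version.cuda)
print("cuDNN loaded by torch:", torch.backends.cudnn.version())
print("flash-attn:", version("flash-attn"))
print("transformer-engine:", version("transformer_engine"))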
Many thanks in advance
I got the same problem, but I am not sure how to resolve it.
I fixed it by installing an earlier version of Flash Attention (2.6.0?).
Thanks for the information. Do you mean installing it with:
python -m pip install git+https://github.com/Dao-AILab/flash-attention@v2.6.0
By the way, since I do not have sudo, I tried to install everything for Cosmos myself: I installed torch 2.4/2.5 with CUDA 12.1/12.4, then installed nvcc and the CUDA toolkit with the same version as CUDA, then installed cuDNN 9.3, all from conda. I am not sure what caused this problem.
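One thing I still want to rule out is a mismatch between the conda-installed toolkit/cuDNN and what torch was actually built against; a rough sketch of that check (assuming nvcc is on PATH inside the active environment):

# Compare the CUDA/cuDNN that torch was built with against the conda toolkit.
# Assumes nvcc is on PATH inside the active environment.
import subprocess
import torch

print("torch built for CUDA:", torch.version.cuda)
print("cuDNN loaded by torch:", torch.backends.cudnn.version())  # e.g. 90300 = 9.3.0
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)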
@andypinxinliu I had the same problem; it was solved as described here: https://github.com/Dao-AILab/flash-attention/issues/1421#issuecomment-2575547768
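If rebuilding Flash Attention is not an option right away, a possible temporary workaround (assuming your Transformer Engine version honours its documented attention environment variables) is to steer it away from the failing cuDNN fused-attention backend and log which backend it selects instead:

# Temporary workaround sketch: disable the cuDNN fused-attention backend in
# Transformer Engine and enable backend-selection logging.
# NVTE_FUSED_ATTN / NVTE_DEBUG / NVTE_DEBUG_LEVEL are Transformer Engine
# environment variables; check that your installed version supports them.
# Set these before transformer_engine / the model is imported.
import os

os.environ["NVTE_FUSED_ATTN"] = "0"   # skip the cuDNN fused-attention path
os.environ["NVTE_DEBUG"] = "1"        # log attention backend selection
os.environ["NVTE_DEBUG_LEVEL"] = "2"  # more detail in that log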