AssertionError: sparse attention only supports training in fp16 currently, please file a github issue if you need fp32 support
Hello, I used the Colab notebook, chose the 16L_64HD_8H_512I_128T_cc12m_cc3m_3E checkpoint, and got this error.

script:
!python /content/dalle-pytorch-pretrained/DALLE-pytorch/generate.py --dalle_path=$checkpoint_path --taming --text="$text" --num_images=$num_images --batch_size=$batch_size --outputs_dir="$_folder"; wait;
variables:
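For illustration only, the notebook variables referenced in the command above might be set in a Colab cell like this (the values below are hypothetical placeholders, not the ones from my run):

```
# hypothetical example values for the notebook variables used by the command above
checkpoint_path = "/content/16L_64HD_8H_512I_128T_cc12m_cc3m_3E.pt"  # downloaded checkpoint
text = "a photo of a mountain lake at sunrise"                       # prompt to generate
num_images = 4                                                       # how many images to sample
batch_size = 4                                                       # images per batch
_folder = "/content/outputs"                                         # where generate.py writes results
```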

I'm having the exact same issue
Hi! I didn't find a way to work around it in the Colab notebook, but I am able to run the generate.py script. I found you have to roll back both deepspeed and triton to their July 2021 versions. For example, with deepspeed 0.4.4 and triton 0.4.2 it works!
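In case it helps, here is a minimal sketch of that rollback as a Colab cell (assuming these pinned versions are still installable and compatible with your CUDA/PyTorch setup):

```
# pin deepspeed and triton to the July 2021 releases mentioned above
!pip install deepspeed==0.4.4 triton==0.4.2
# you may need to restart the Colab runtime after installing,
# then re-run generate.py as shown earlier in this thread
```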
Hey! Is there a way to load the models on an A100 with CUDA 11? It seems that deepspeed 0.4.4 and triton 0.4.2 do not work with CUDA 11, but the pre-trained models require them. Thanks in advance!
I have the same problem. Does anyone have suggestions? BTW, @sbyebss's solution is not working for me (I get a different error when using deepspeed 0.4.4).
@Penguin-jpg were you able to resolve the issue?
Thanks in advance!
@fdchiu Sorry for the late reply. I couldn't resolve this issue with any of the methods I found.