AssertionError: sparse attention only supports training in fp16 currently, please file a github issue if you need fp32 support
Hello, I used the Colab notebook, chose the 16L_64HD_8H_512I_128T_cc12m_cc3m_3E checkpoint, and got this error.

script:
!python /content/dalle-pytorch-pretrained/DALLE-pytorch/generate.py --dalle_path=$checkpoint_path --taming --text="$text" --num_images=$num_images --batch_size=$batch_size --outputs_dir="$_folder"; wait;
variables:
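For illustration only, the notebook variables referenced in the command above might be set in a Colab cell like this (the values below are hypothetical placeholders, not the ones from my run):

```
# hypothetical example values for the notebook variables used by the command above
checkpoint_path = "/content/16L_64HD_8H_512I_128T_cc12m_cc3m_3E.pt"  # downloaded checkpoint
text = "a photo of a mountain lake at sunrise"                       # prompt to generate
num_images = 4                                                       # how many images to sample
batch_size = 4                                                       # images per batch
_folder = "/content/outputs"                                         # where generate.py writes results
```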

I'm having the exact same issue
Hi! I didn't find a way to work around it in the Colab notebook, but I am able to run the generate.py script. I found you have to roll back both deepspeed and triton to their July 2021 versions. For example, with deepspeed 0.4.4 and triton 0.4.2 it works!
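In case it helps, here is a minimal sketch of that rollback as a Colab cell (assuming these pinned versions are still installable and compatible with your CUDA/PyTorch setup):

```
# pin deepspeed and triton to the July 2021 releases mentioned above
!pip install deepspeed==0.4.4 triton==0.4.2
# you may need to restart the Colab runtime after installing,
# then re-run generate.py as shown earlier in this thread
```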
Hey! Is there a way to load the models on an A100 with CUDA 11? It seems that deepspeed 0.4.4 and triton 0.4.2 do not work with CUDA 11, but the pre-trained models require them. Thanks in advance!
I have the same problem. Does anyone have suggestions? BTW, @sbyebss's solution is not working for me (I get a different error when using deepspeed 0.4.4).
@Penguin-jpg were you able to resolve the issue?
Thanks in advance!
@fdchiu Sorry for the late reply. I couldn't resolve this issue with any of the methods I found.