
AssertionError: sparse attention only supports training in fp16 currently, please file a github issue if you need fp32 support

Open Penguin-jpg opened this issue 3 years ago • 5 comments

Hello, I used the Colab notebook and chose the 16L_64HD_8H_512I_128T_cc12m_cc3m_3E checkpoint, and this error occurred: [image]

script: !python /content/dalle-pytorch-pretrained/DALLE-pytorch/generate.py --dalle_path=$checkpoint_path --taming --text="$text" --num_images=$num_images --batch_size=$batch_size --outputs_dir="$_folder"; wait;

variables: [image]

Penguin-jpg, Mar 18 '22 09:03

I'm having the exact same issue

narangkay, Apr 11 '22 19:04

Hi! I couldn't find a workaround for the Colab notebook, but I am able to run the generate.py script. I found that you have to roll back both deepspeed and triton to their July 2021 versions. For example, with deepspeed 0.4.4 and triton 0.4.2, it works!
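In case it helps anyone checking their environment, here is a minimal sketch that compares the installed deepspeed and triton versions against the two version numbers mentioned above. It uses only the standard-library `importlib.metadata`; the `PINNED` dict and the `find_mismatches` helper are hypothetical names for illustration, and the pins are just the one combination reported to work here, not an official requirement.

```python
from importlib import metadata

# Versions reported to work in this thread; an assumption, not an official pin.
PINNED = {"deepspeed": "0.4.4", "triton": "0.4.2"}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: installed version or None} for each package off its pin."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches(PINNED).items():
        print(f"{name}: found {installed}, expected {PINNED[name]}")
        print(f"  try: pip install {name}=={PINNED[name]}")
```

If nothing is printed, both packages already match. Whether these versions install cleanly depends on your Python and CUDA setup, so treat the suggested pip command as a starting point only.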

sbyebss, Apr 22 '22 03:04

Hey! Is there a way to load the models on an A100 with CUDA 11? It seems that neither deepspeed 0.4.4 nor triton 0.4.2 works with CUDA 11, but the pre-trained models require them. Thanks in advance!

tianjianh, Jun 22 '22 00:06

I have the same problem. Does anyone have suggestions? BTW, @sbyebss's solution is not working for me (I got a different error when using deepspeed 0.4.4).

@Penguin-jpg were you able to resolve the issue?

Thanks in advance!

fdchiu, Nov 28 '22 00:11

@fdchiu Sorry for the late reply. I could not resolve this issue with any of the methods I found.

Penguin-jpg, Jan 08 '23 06:01