
RuntimeError: CUDA out of memory

Open 2000lf opened this issue 1 year ago • 4 comments

Can I remove the attention layers for high-resolution images?

2000lf avatar Nov 29 '24 05:11 2000lf

Yes, you can try removing the attention to see if that gets rid of the error. What's the image size you are working with?

A few other things that will reduce the compute requirements, in case you are working with the default config:

  1. Reduce the batch size (the config uses 64 as of now).
  2. Keep attention only in the midblock and remove the Downblock attention ([here](https://github.com/explainingai-code/DDPM-Pytorch/blob/main/models/unet_base.py#L98-L105)) and the Upblock attention ([here](https://github.com/explainingai-code/DDPM-Pytorch/blob/main/models/unet_base.py#L275-L281)).
  3. By default, downsampling is disabled on the last downblock (since MNIST images are small anyway), so change [this](https://github.com/explainingai-code/DDPM-Pytorch/blob/main/config/default.yaml#L14) value to `[True, True, True]`.
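As a sketch, the changes above might look like the following fragment of `default.yaml`. The key names here are illustrative only, not verified against the repo's actual config schema:

```yaml
# Hypothetical fragment of config/default.yaml -- key names are
# illustrative, check the actual file for the names the repo uses.
train_params:
  batch_size: 8              # down from the default 64
model_params:
  # Keep attention only in the midblock:
  attn_down: [False, False, False]
  # Enable downsampling on every downblock for large images:
  down_sample: [True, True, True]
```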

explainingai-code avatar Nov 29 '24 06:11 explainingai-code

Thank you for your advice. I am using images of size 900×1600, and they consume a huge amount of memory when they hit the attention layers.
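A back-of-envelope calculation shows why attention blows up at that resolution: self-attention materializes a (tokens × tokens) score matrix per head. Assuming, for illustration, a stride-8 feature map and fp32 scores, a single attention map for one 900×1600 image already costs about 2 GB:

```python
def attn_score_bytes(h, w, stride=8, bytes_per_el=4):
    """Rough size of one self-attention score matrix (tokens x tokens)
    for an h x w image downsampled by `stride` (fp32, one head, one sample)."""
    tokens = (h // stride) * (w // stride)  # flattened spatial positions
    return tokens * tokens * bytes_per_el

print(attn_score_bytes(900, 1600) / 1e9)  # ~2.0 GB for a single score matrix
```

And that is per head, per sample, before counting activations kept for the backward pass, which is why removing attention (or shrinking the images) helps so much here.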

2000lf avatar Nov 29 '24 06:11 2000lf

Got it. Yeah, try with a batch size of 1 first. If that works, you can train with gradient accumulation. But if that also fails, then you would have to either remove the attention layers or train with smaller images.
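For reference, a minimal sketch of gradient accumulation in PyTorch. This is a generic training loop, not the repo's actual trainer: it processes micro-batches and only steps the optimizer every `accum_steps` batches, emulating a larger effective batch size without the peak-memory cost:

```python
import torch
import torch.nn as nn

def train_accumulated(model, optimizer, batches, accum_steps=8):
    """Accumulate gradients over `accum_steps` micro-batches, stepping
    the optimizer once per virtual batch of that size."""
    model.train()
    optimizer.zero_grad()
    for step, (x, target) in enumerate(batches, start=1):
        loss = nn.functional.mse_loss(model(x), target)
        # Scale so the accumulated gradient averages over the virtual batch
        (loss / accum_steps).backward()
        if step % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```

With a micro-batch size of 1 and `accum_steps=8`, this behaves like training with batch size 8 while never holding more than one sample's activations in memory.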

explainingai-code avatar Nov 29 '24 07:11 explainingai-code

Thank you, I will try it as you suggested.

2000lf avatar Nov 29 '24 07:11 2000lf