[BUG]: `Segmentation fault (core dumped)` when running ldm on the CIFAR-10 dataset

Open Celia0u0 opened this issue 2 years ago • 1 comments

🐛 Describe the bug

...
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
DiffusionWrapper has 865.91 M params.
=========================================================================================
No pre-built kernel is found, build and load the cpu_adam kernel during runtime now
=========================================================================================
Emitting ninja build file /home/fangfei/.cache/colossalai/torch_extensions/torch1.13_cu11.7/build.ninja...
Building extension module cpu_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module cpu_adam...
Time to load cpu_adam op: 2.413104295730591 seconds
=========================================================================================
No pre-built kernel is found, build and load the fused_optim kernel during runtime now
=========================================================================================
Detected CUDA files, patching ldflags
Emitting ninja build file /home/fangfei/.cache/colossalai/torch_extensions/torch1.13_cu11.7/build.ninja...
Building extension module fused_optim...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module fused_optim...
Time to load fused_optim op: 1.8385069370269775 seconds
Segmentation fault (core dumped)

This happens when running Stable Diffusion with `python main.py --logdir /tmp/ --train --base configs/train_colossalai_cifar10.yaml`, and I cannot locate the cause of the error.
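Since the log ends at `Segmentation fault (core dumped)` with no Python traceback, one way to narrow down where the crash happens is Python's standard `faulthandler` module, which dumps the Python stack of every thread when the process receives a fatal signal such as SIGSEGV. The sketch below is only a debugging aid, not a fix; the helper name `enable_crash_traces` and the log file name are hypothetical, and it assumes the call is placed near the top of `main.py` before training starts.

```python
import faulthandler


def enable_crash_traces(log_path):
    """Write Python tracebacks for all threads to log_path on a fatal signal.

    The returned file object must stay open for as long as the handler
    is enabled, because faulthandler writes to its file descriptor.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Hypothetical usage near the top of main.py:
# crash_log = enable_crash_traces("segfault_trace.log")
```

The same handler can be switched on without editing any code by launching the script as `python -X faulthandler main.py ...`, in which case the traceback is written to stderr. The traceback shows only the Python frames that were active at the time of the crash, so it points at the call that entered the failing native code rather than at the faulty line inside the extension itself.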

Environment

pytorch-lightning ==1.8.1

Celia0u0 avatar Feb 27 '23 08:02 Celia0u0

There may be a problem with your --logdir. Please check whether your /tmp folder has enough space to store the log files.
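The free-space check suggested above can be done with `shutil.disk_usage` from the standard library. This is a minimal sketch: the helper names and the 10 GiB threshold are assumptions chosen for illustration, and the thread does not confirm that a full disk is the actual cause of the segmentation fault.

```python
import shutil


def free_gib(path):
    """Return the free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 2**30


def has_enough_space(path, required_gib):
    """Return True if at least required_gib GiB are free at path."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    logdir = "/tmp"  # the --logdir value from the command in the issue
    required = 10.0  # hypothetical threshold; adjust to the expected checkpoint size
    print(f"{logdir}: {free_gib(logdir):.1f} GiB free, "
          f"enough for {required} GiB: {has_enough_space(logdir, required)}")
```

If the check reports too little space, pointing `--logdir` at a directory on a larger filesystem is a quick way to rule this cause in or out.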

MichelleMa8 avatar Mar 07 '23 07:03 MichelleMa8

We have updated a lot. This issue was closed due to inactivity. Thanks.

binmakeswell avatar Apr 26 '23 10:04 binmakeswell