
Docker - "Performing inference on FOLD: 1 /extra/pipeline.sh: line 38: 1206 Killed"

mojomattv opened this issue on Apr 26 '22 • 4 comments

Hi and thank you very much for sharing this program. I was hoping for some advice on an error that keeps being thrown when running the following command:

sudo docker run --rm -v $(pwd)/INPUTS/:/INPUTS/ -v $(pwd)/OUTPUTS:/OUTPUTS/ -v $(pwd)/INPUTS/license.txt:/extra/freesurfer/license.txt --user $(id -u):$(id -g) hansencb/synb0 --notopup

Everything seems to run smoothly until the "Performing inference on FOLD: *" stage, when there is apparently an issue with line 38 of pipeline.sh. Any clarification or possible solution would be greatly appreciated (see attached for the output of the above command).

synb0_output.txt

mojomattv avatar Apr 26 '22 21:04 mojomattv

Hi mojomattv! I got the same error as you. My environment is as follows: macOS Monterey, Docker Desktop 4.8.2, Synb0-DISCO v2.0. I increased the resource allocation for Docker (Docker Desktop > Settings > Resources), and it worked perfectly! If you use a Mac, you can try this.
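If you want to confirm the allocation from the command line before rerunning, a quick check (assuming a standard Docker setup; on Docker Desktop the reported value reflects the Resources setting) is:

# Memory available to the Docker daemon / Desktop VM
docker info | grep -i "total memory"

# Or confirm from inside a throwaway container
docker run --rm alpine free -m

If the reported total is only a few GB, the inference step is likely to be killed again.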

Kikubernetes avatar Jul 05 '22 13:07 Kikubernetes

Dear list, I am running the same pipeline without Docker and getting stuck at the exact step where @mojomattv was struggling. I am running this on Red Hat 7 with a modified pipeline.sh file. The error is: RuntimeError: CUDA out of memory. Tried to allocate 38.00 MiB (GPU 0; 1.94 GiB total capacity; 908.41 MiB already allocated; 17.38 MiB free; 916.00 MiB reserved in total by PyTorch)

Please find attached my pipeline and the output file.

Kindly suggest what is needed. Reducing the batch size has been suggested, but how do I do that? Or is there a way to increase the resource allocation for this program?
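One possible workaround for the GPU out-of-memory error, assuming the inference script falls back to CPU whenever PyTorch sees no CUDA device (I have not verified this against the Synb0-DISCO code), would be to hide the GPU so the model runs in system RAM instead of the ~2 GiB of GPU memory:

# Hide all CUDA devices for this shell session; if the script checks
# torch.cuda.is_available(), it should fall back to CPU.
export CUDA_VISIBLE_DEVICES=""
# Then launch the modified pipeline as usual, e.g.:
bash pipeline.sh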

Thanks and regards, Himanshu Joshi

Synb0_output.txt pipeline.txt

anshuhim20 avatar Aug 02 '22 08:08 anshuhim20

Hi all,

Kikubernetes is correct: this is failing at the inference stage and is due only to memory. We suggest allocating 16 GB of RAM (or more) when running either the Docker or the Singularity image.
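For example (a sketch only; the memory flag, Singularity image name, and scheduler directives below are placeholders to adapt to your own setup): on Docker Desktop the limit is set under Settings > Resources as Kikubernetes showed, while on a Linux Docker Engine host the container can be capped explicitly:

# Linux Docker Engine: allow the container up to 16 GB
# (the host must actually have that much RAM free)
sudo docker run --rm --memory=16g -v $(pwd)/INPUTS/:/INPUTS/ -v $(pwd)/OUTPUTS:/OUTPUTS/ -v $(pwd)/INPUTS/license.txt:/extra/freesurfer/license.txt --user $(id -u):$(id -g) hansencb/synb0 --notopup

On a cluster the memory is normally requested from the job scheduler rather than from Singularity itself; a minimal SLURM job script might look like:

#!/bin/bash
#SBATCH --mem=16G
# "synb0.simg" and the bind mounts are placeholders; follow the README for the exact invocation
singularity run -e -B INPUTS/:/INPUTS/ -B OUTPUTS/:/OUTPUTS/ -B license.txt:/extra/freesurfer/license.txt synb0.simg --notopup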

Thank you, Kurt

Diffusion-MRI avatar Aug 02 '22 12:08 Diffusion-MRI

Might it be worth updating the README to reflect this? Currently it states "we suggest giving Docker access to >8Gb of RAM". I also had this issue on my desktop machine (I allocated 15G out of 16 available), so I ran it in Singularity on an HPC cluster instead. The job only reported a maxvmem of 13.110 GB, though.
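For reference, maxvmem is the peak-memory field that Grid Engine accounting reports; assuming an SGE-style scheduler, it can be read back for a finished job with:

# Peak memory recorded by SGE accounting for a completed job
qacct -j <job_id> | grep maxvmem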

fionaEyoung avatar Nov 01 '22 15:11 fionaEyoung

Hi Fiona and all - great suggestion. We have now reflected this in the README. Also - I apologize for the delayed response!

schillkg avatar Apr 05 '23 21:04 schillkg