CellProfiler-plugins

RunCellpose_Issues with GPU memory share setting

Open sugan89 opened this issue 1 year ago • 4 comments

The RunCellpose plugin works well in a Python environment when the "GPU memory share for each worker" option is set to 1, but when the option is set to 0.1, I get the following error:

```
** TORCH CUDA version installed and working. **
>>>> using GPU
>>>> model diam_mean =  30.000 (ROIs rescaled to this size during training)
>>>> model diam_labels =  34.352 (mean diameter of training ROIs)
Unable to create masks. Check your module settings. CUDA out of memory. Tried to allocate 98.00 MiB. GPU 0 has a total capacity of 4.00 GiB of which 2.86 GiB is free. Of the allocated memory 254.49 MiB is allocated by PyTorch, and 97.51 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
Failed to run module RunCellpose
Traceback (most recent call last):
  File "C:\Users\ssivagur\Anaconda3\envs\CP_plugins\lib\site-packages\cellprofiler\gui\pipelinecontroller.py", line 3390, in do_step
    self.__pipeline.run_module(module, workspace_model)
  File "C:\Users\ssivagur\Anaconda3\envs\CP_plugins\lib\site-packages\cellprofiler_core\pipeline\_pipeline.py", line 1349, in run_module
    module.run(workspace)
  File "C:\Users\ssivagur\Documents\GitHub\CellProfiler-plugins\active_plugins\runcellpose.py", line 606, in run
    y.segmented = y_data
UnboundLocalError: local variable 'y_data' referenced before assignment
```
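
The log prints "Unable to create masks" and then fails with an `UnboundLocalError`. That is consistent with an exception being caught and reported while the code carries on to use a variable that was never assigned. The sketch below reproduces that pattern in isolation. It is a guess at the shape of the problem, not the plugin's actual code: `run_segmentation`, `run_segmentation_guarded`, and `out_of_memory` are hypothetical names, and the out-of-memory error is simulated.

```python
def out_of_memory():
    # Hypothetical stand-in for a segmentation call that fails on the GPU.
    raise MemoryError("CUDA out of memory (simulated)")


def run_segmentation(segment):
    # Pattern that would produce the traceback above: the error is printed
    # and swallowed, so y_data is never bound when segment() raises.
    try:
        y_data = segment()
    except Exception as err:
        print(f"Unable to create masks. Check your module settings. {err}")
    return y_data  # raises UnboundLocalError if the try block failed


def run_segmentation_guarded(segment):
    # One possible alternative: re-raise so the original failure is what
    # the caller sees, instead of a secondary UnboundLocalError.
    try:
        y_data = segment()
    except Exception as err:
        raise RuntimeError(f"Unable to create masks: {err}") from err
    return y_data


if __name__ == "__main__":
    for fn in (run_segmentation, run_segmentation_guarded):
        try:
            fn(out_of_memory)
        except Exception as err:
            print(f"{fn.__name__}: {type(err).__name__}")
```

If something like this is what happens, the `UnboundLocalError` is only a symptom, and the message printed just before it (here, the CUDA out-of-memory error) is the real cause.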

sugan89 avatar Mar 01 '24 00:03 sugan89

Looking at the torch documentation, the function we use to do the memory subsetting CLAIMS it works off a fraction of the total memory, so the allocation should fit in 10%. I have not yet checked whether this is a known bug.
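
As a rough check on that 10% figure, here is the arithmetic using only the numbers from the error message above. It rests on two assumptions I have not verified: that the function in question is `torch.cuda.set_per_process_memory_fraction`, and that the cap counts everything PyTorch has reserved (allocated plus cached) rather than only live allocations. `fits_under_cap` is a hypothetical helper for illustration, not part of torch or the plugin.

```python
MIB_PER_GIB = 1024


def fits_under_cap(total_gib, fraction, held_mib, request_mib):
    """Return (cap, needed, fits) in MiB for a per-process memory fraction.

    Assumption: the cap applies to memory already held by the allocator
    plus the new request.
    """
    cap_mib = total_gib * MIB_PER_GIB * fraction
    needed_mib = held_mib + request_mib
    return cap_mib, needed_mib, needed_mib <= cap_mib


# Values copied from the error message in this issue.
total_gib = 4.00
allocated_mib = 254.49
reserved_unallocated_mib = 97.51
request_mib = 98.00

held_mib = allocated_mib + reserved_unallocated_mib

for fraction in (0.1, 1.0):
    cap, needed, fits = fits_under_cap(total_gib, fraction, held_mib, request_mib)
    print(f"fraction={fraction}: cap={cap:.1f} MiB, needed={needed:.1f} MiB, fits={fits}")
```

Under those assumptions, a share of 0.1 on a 4.00 GiB card gives a cap of about 409.6 MiB, while the memory already held plus the 98.00 MiB request comes to about 450 MiB. That would explain an out-of-memory error even though 2.86 GiB is reported free. If the cap counted only the 254.49 MiB of live allocations, the request would fit, so this reading may not be the whole story.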

bethac07 avatar Mar 01 '24 00:03 bethac07

Might be related - https://forum.image.sc/t/cellprofiler-plugins-cellpose-stardist-gpu-memory-in-test-mode/95938

ShataDg avatar May 08 '24 14:05 ShataDg

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/runcellpose-error/86328/13

imagesc-bot avatar Jun 17 '24 21:06 imagesc-bot

FYI, I just got the same `UnboundLocalError: local variable 'y_data' referenced before assignment` from line 606. I have "Use GPU" set to No in my pipeline. I'm running in Docker with `erinweisbart/distributed-cellprofiler:2.0.0_4.2.4_cellpose`.

EDIT: after updating my plugins on the Docker with a fresh git pull the error goes away!

ErinWeisbart avatar Jun 18 '24 18:06 ErinWeisbart