
Can't train cellpose using mask

Open: ianmanifacier opened this issue 1 year ago

Hi, I have tried to retrain the Cellpose 'cyto3' and 'livecell' models using TIF masks, as indicated in the documentation.

0 = black = background; each value between 1 and 255 corresponds to the mask of a single cell. frame_000.tif and frame_000_masks.tif (click the links for example files).

When I launch training on 230 images, the training process runs without error. However, after training ends and I use the newly trained model located in the folder, I get worse results.

I must be doing something wrong but I don't know what.

I tried formatting the TIF files as both 8-bit and 16-bit.
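As a sanity check of the mask format above, a short snippet like this can confirm the dtype and label values (a minimal sketch; it assumes the frame_000_masks.tif file linked above and is not part of the training code):

import numpy as np
import skimage.io

# minimal sketch: verify the mask layout (0 = background, 1..N = cell labels)
masks = skimage.io.imread("frame_000_masks.tif")
print("dtype:", masks.dtype)                           # uint8 or uint16
print("background pixels:", np.sum(masks == 0))
print("number of cells:", len(np.unique(masks)) - 1)   # unique labels minus background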

Here is a preview image

The Python code I used to train Cellpose:

import os, shutil, time, sys
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import skimage.io

from cellpose import core, io, models, train

#%matplotlib inline
mpl.rcParams['figure.dpi'] = 300

use_GPU = core.use_gpu()
print('>>> GPU activated? %d'%use_GPU)

io.logger_setup()

train_dir = "C:/Users/Stagiaire/Desktop/masks_cellpose/cyto3/no_resize/train_dir"
test_dir = "C:/Users/Stagiaire/Desktop/masks_cellpose/cyto3/no_resize/test_dir"
image_filter = ""
mask_filter="_masks"
output = io.load_train_test_data(train_dir, test_dir, image_filter=image_filter, mask_filter=mask_filter, look_one_level_down=False)
images, labels, image_names, test_images, test_labels, image_names_test = output

# e.g. retrain a Cellpose model
model = models.CellposeModel(gpu=use_GPU, model_type="cyto3")

model_path = train.train_seg(model.net, train_data=images, train_labels=labels, n_epochs = 2000,
                            channels=[1,0], normalize=True,
                            test_data=test_images, test_labels=test_labels)
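For reference, the retrained model is then reloaded and applied roughly like this (a minimal sketch, not the exact evaluation code; it assumes model_path is the saved model file returned by train_seg and reuses channels=[1,0] from the training call):

# minimal sketch: reload the retrained model and segment one frame
retrained = models.CellposeModel(gpu=use_GPU, pretrained_model=model_path)
img = skimage.io.imread("frame_000.tif")   # example frame from above
masks, flows, styles = retrained.eval(img, channels=[1, 0], normalize=True)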

Thank you for your help,
Ian

ianmanifacier avatar May 29 '24 15:05 ianmanifacier

For retraining, you want to use the Cellpose 2 settings, which are provided in the example in the docs:

model_path = train.train_seg(model.net, train_data=images, train_labels=labels,
                            channels=[1,2], normalize=True,
                            test_data=test_images, test_labels=test_labels,
                            weight_decay=1e-4, SGD=True, learning_rate=0.1, # <-- these are the retraining params
                            n_epochs=300, model_name="my_new_model")

https://cellpose.readthedocs.io/en/latest/train.html

We found that SGD usually works best when retraining, and AdamW works better when training from scratch. We will add more documentation on this.
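For comparison, training from scratch with AdamW would look roughly like this (a minimal sketch based on the same train_seg call; pretrained_model=False, the learning rate, and the epoch count are illustrative assumptions, not values from this thread):

# minimal sketch: SGD=False selects the AdamW optimizer in train_seg
# (how to initialize an untrained net may vary slightly by cellpose version)
model = models.CellposeModel(gpu=use_GPU, pretrained_model=False, diam_mean=30)
model_path = train.train_seg(model.net, train_data=images, train_labels=labels,
                             channels=[1, 2], normalize=True,
                             test_data=test_images, test_labels=test_labels,
                             weight_decay=1e-4, SGD=False, learning_rate=0.001,
                             n_epochs=500, model_name="my_scratch_model")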

carsen-stringer avatar Jun 05 '24 22:06 carsen-stringer

Hi, I'm using the example on the website to train on my data for the first time, but I get this error every time and I don't know why. Please help! The code:

Load the data

output = io.load_train_test_data(train_dir, test_dir, image_filter="_img",
                                 mask_filter="_masks", look_one_level_down=False)
images, labels, image_names, test_images, test_labels, image_names_test = output

Train the model

model = models.CellposeModel(gpu=True, diam_mean=49, pretrained_model=False, model_type="cyto3")
model_path = train.train_seg(model.net, train_data=images, train_labels=labels,
                             channels=[2, 3], normalize=True,
                             test_data=test_images, test_labels=test_labels,
                             weight_decay=1e-4, SGD=True, learning_rate=0.1,
                             n_epochs=3, model_name="my_new_model")

The error:

creating new log file
2024-07-26 13:46:47,138 [INFO] WRITING LOG OUTPUT TO C:\Users\samir\.cellpose\run.log
2024-07-26 13:46:47,138 [INFO] cellpose version: 3.0.10
platform: win32
python version: 3.9.19
torch version: 2.5.0.dev20240724
2024-07-26 13:46:47,206 [INFO] not all flows are present, running flow generation for all images
2024-07-26 13:46:49,088 [INFO] 1010 / 1010 images in C:/Users/samir/Documents/cellpose/data/train folder have labels
2024-07-26 13:46:49,088 [INFO] not all flows are present, running flow generation for all images
2024-07-26 13:46:49,331 [INFO] 140 / 140 images in C:/Users/samir/Documents/cellpose/data/test folder have labels
2024-07-26 13:46:49,331 [INFO] >> cyto3 << model set to be used
2024-07-26 13:46:49,444 [INFO] ** TORCH CUDA version installed and working. **
2024-07-26 13:46:49,444 [INFO] >>>> using GPU
2024-07-26 13:46:49,476 [INFO] >>>> loading model C:\Users\samir\.cellpose\models\cyto3
C:\Users\samir\anaconda3\envs\cellpose\lib\site-packages\cellpose\resnet_torch.py:276: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. [...]
  state_dict = torch.load(filename, map_location=device)
2024-07-26 13:46:49,507 [INFO] >>>> model diam_mean = 49.000 (ROIs rescaled to this size during training)
2024-07-26 13:46:49,507 [INFO] computing flows for labels
100%|██████████| 1010/1010 [01:24<00:00, 11.93it/s]
2024-07-26 13:48:14,235 [INFO] computing flows for labels
 34%|███▍      | 48/140 [00:03<00:07, 12.52it/s]
2024-07-26 13:48:18,259 [WARNING] empty masks!
100%|██████████| 140/140 [00:11<00:00, 12.54it/s]
2024-07-26 13:48:25,418 [INFO] >>> computing diameters
100%|██████████| 1010/1010 [00:00<00:00, 6556.14it/s]
C:\Users\samir\anaconda3\envs\cellpose\lib\site-packages\numpy\core\fromnumeric.py:3504: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
C:\Users\samir\anaconda3\envs\cellpose\lib\site-packages\numpy\core\_methods.py:129: RuntimeWarning: invalid value encountered in scalar divide
  ret = ret.dtype.type(ret / rcount)
100%|██████████| 140/140 [00:00<00:00, 8403.25it/s]
2024-07-26 13:48:25,589 [WARNING] 1010 train images with number of masks less than min_train_masks (5), removing from train set
2024-07-26 13:48:25,589 [INFO] >>> using channels [2, 3]
2024-07-26 13:48:25,589 [INFO] >>> normalizing {'lowhigh': None, 'percentile': None, 'normalize': True, 'norm3D': False, 'sharpen_radius': 0, 'smooth_radius': 0, 'tile_norm_blocksize': 0, 'tile_norm_smooth3D': 1, 'invert': False}
C:\Users\samir\anaconda3\envs\cellpose\lib\site-packages\cellpose\train.py:406: RuntimeWarning: Mean of empty slice.
  net.diam_labels.data = torch.Tensor([diam_train.mean()]).to(device)
2024-07-26 13:48:25,784 [INFO] >>> n_epochs=3, n_train=0, n_test=140
2024-07-26 13:48:25,784 [INFO] >>> SGD, learning_rate=0.10000, weight_decay=0.00010, momentum=0.900
2024-07-26 13:48:26,328 [INFO] >>> saving model to C:\Users\samir\Documents\cellpose\models\my_new_model
Traceback (most recent call last):
  File "c:\Users\samir\Documents\cellpose\train_cellpose.py", line 14, in <module>
    model_path = train.train_seg(model.net, train_data=images, train_labels=labels, channels=[2, 3], normalize=True, test_data=test_images, test_labels=test_labels, weight_decay=1e-4, SGD=True, learning_rate=0.1, n_epochs=3, model_name="my_new_model")
  File "C:\Users\samir\anaconda3\envs\cellpose\lib\site-packages\cellpose\train.py", line 516, in train_seg
    lavg /= nsum
ZeroDivisionError: division by zero

sami22ivr avatar Jul 26 '24 12:07 sami22ivr

You are receiving that error because you have zero training images:

2024-07-26 13:48:25,784 [INFO] >>> n_epochs=3, n_train=0, n_test=140
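To see why n_train drops to 0, one can count the labelled cells in each training mask and compare against min_train_masks, the threshold named in the warning above (5 by default). A minimal diagnostic sketch, assuming labels is the list returned by io.load_train_test_data:

import numpy as np

# minimal sketch: cells per training mask (unique labels minus background)
n_cells = [len(np.unique(lbl)) - 1 for lbl in labels]
print("images with >= 5 cells:", sum(n >= 5 for n in n_cells))

# if the images legitimately contain fewer than 5 cells each, the threshold
# can be lowered via the min_train_masks argument of train.train_seg
# (the parameter named in the warning above), e.g. min_train_masks=1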

Closing this issue for now, but let me know if you have more issues.

carsen-stringer avatar Sep 10 '24 07:09 carsen-stringer