
Multiclass segmentation example unlabeled pixels

Open asmagen opened this issue 5 years ago • 13 comments

I have been trying to adapt examples/multiclass segmentation (camvid).ipynb to my imaging context, where I have a lot of unannotated pixels assigned the value 0 in the mask (besides 0 I have multiple classes, such as 1 = normal, 2 = tumor, etc.).

I noticed that in the multi-class example in this repository none of the selected labels correspond to 'unlabelled'; the CLASSES variable does include an unlabelled category, but it is the last label (number 12) rather than the first, as it is in my case.

How do I handle unlabelled pixels in my case? Do I have to set them up as the last class rather than the first so that it's most compatible with this workflow? It's not clear to me how that 'background' channel/category is handled during learning. My annotation is sparse because I'm working with very large images (pathology), so I would ideally have the model ignore the unannotated pixels if possible. Is that something I do with the weights, or is there another way?

Also, if my patches are of size 512 x 512, what changes do I need to make to the model?

Thanks

asmagen avatar Jun 05 '20 03:06 asmagen

What I did for my background or ignore class (a catch-all class for things you don't care about) is use a loss function that accepts a weight for each class category: set all of the weights to 1 except for the ignore class, which I set to 0.

class_names = ['cat', 'dog', 'bird', 'ignore']
class_weights = [1, 1, 1, 0]

loss_function = some_loss_function(class_weights=class_weights)

By doing this, your model will essentially ignore that class category and will not make predictions for it. But if the things you labelled as background are present in the test images, they will be predicted as one of the other classes (which may or may not be what you want, not sure).
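A minimal sketch of that setup with segmentation_models, whose Dice and categorical cross-entropy losses accept a class_weights argument; the class names, weights, and loss combination below are illustrative assumptions, not something from this thread:

import numpy as np
import segmentation_models as sm

# last channel is the catch-all 'ignore' class; weight 0 removes it from the loss
CLASSES = ['normal', 'tumor', 'ignore']
class_weights = np.array([1.0, 1.0, 0.0])

dice_loss = sm.losses.DiceLoss(class_weights=class_weights)
ce_loss = sm.losses.CategoricalCELoss(class_weights=class_weights)
total_loss = dice_loss + ce_loss  # sm loss objects can be summed into a combined loss

The combined loss is then passed to model.compile() as usual.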

JordanMakesMaps avatar Jun 05 '20 15:06 JordanMakesMaps

Thanks @JordanMakesMaps

So I have to make sure all of the unlabelled pixels are assigned to a background category and that this category gets a zero weight. Does that mean that even if a bunch of pixels should have been annotated as one of the classes, it won't confuse the model between background and that category?

And again, my last question: if my patches are of size 512 x 512, what changes do I need to make to the model?

Thanks

asmagen avatar Jun 05 '20 18:06 asmagen

Does that mean that even if a bunch of pixels should have been annotated as one of the classes, it won't confuse the model between background and that category?

It won't affect the model because it won't see them; they'll be ignored. But the model also won't benefit from them. If you have unannotated pixels where a tumor is and you set the class weight for the ignore class to zero, then those unannotated pixels won't be seen by the model during training. Essentially, it's like blacking them out.

But when your model is done being trained and you feed it test images during the inference phase, the model will still make predictions for all of the other classes (which I believe is what you want).

'Also, if my patches are of size 512 x 512, what changes do I need to make to the model?'

What have you tried so far? Definitely use the preprocess_input() function for whichever backbone you choose (and make sure you also apply it to the images when making predictions during the inference phase). The masks should be in one-hot-encoded format, which can be done using keras.utils.to_categorical(). Maybe add some augmentations to help reduce overfitting (I recommend imgaug).
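A rough sketch of those two steps; the backbone name, file names, and n_classes value are placeholders assumed for illustration:

import cv2
import keras
import segmentation_models as sm

BACKBONE = 'efficientnetb3'                        # example backbone choice
preprocess_input = sm.get_preprocessing(BACKBONE)
n_classes = 3                                      # e.g. normal, tumor, ignore

image = cv2.cvtColor(cv2.imread('patch.png'), cv2.COLOR_BGR2RGB)   # hypothetical image patch
mask = cv2.imread('patch_mask.png', 0)                             # hypothetical mask with integer labels

image = preprocess_input(image)                                    # apply the same call at inference time too
mask = keras.utils.to_categorical(mask, num_classes=n_classes)     # (H, W) -> (H, W, n_classes) one-hot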

JordanMakesMaps avatar Jun 05 '20 20:06 JordanMakesMaps

Thanks @JordanMakesMaps

Regarding the patch size of 512 in my case, I'm confused about what to change in the example notebook because of the following statements and code used there:

(1) "All images have 320 pixels height and 480 pixels width. For more information about the dataset visit http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/."

(2)
A.PadIfNeeded(min_height=320, min_width=320, always_apply=True, border_mode=0),
A.RandomCrop(height=320, width=320, always_apply=True),

(3)
# check shapes for errors
assert train_dataloader[0][0].shape == (BATCH_SIZE, 320, 320, 3)
assert train_dataloader[0][1].shape == (BATCH_SIZE, 320, 320, n_classes)

Does it mean that the original 320x480 images are being turned into 320x320 squares? Simply changing everything to 512 doesn't work.

Also, there is an issue suggesting a bug in the same notebook. Is it something I should revise?

Finally, what pre-trained model is most suitable for histopathology imaging tasks? How can I use it in this tutorial?

Thanks

asmagen avatar Jun 05 '20 21:06 asmagen

You don't need to use the same settings as the notebook. If you want, you can change those lines to fit your needs, or just set the model to take in images of (None, None, num_classes) when you instantiate it. By setting it to (None, None, num_classes), you're telling the model to accept an arbitrary input shape, which is convenient because at test time you can provide images of any size (note that this doesn't work for all architectures, but it does for U-Net).

In the notebook, the padding function is added so that each side is evenly divisible by 32, which is a requirement of the U-Net architecture. Your 512 x 512 images already meet that requirement, so changing all of the lines with 320 x 480 into 512 x 512 will ensure that your images are not altered.
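For example, the pad/crop lines from the notebook's training augmentation could be adapted to 512 x 512 patches like this (a sketch; only the sizes change):

import albumentations as A

train_transform = [
    A.HorizontalFlip(p=0.5),
    # 512 is already divisible by 32, so the padding is effectively a no-op here
    A.PadIfNeeded(min_height=512, min_width=512, always_apply=True, border_mode=0),
    A.RandomCrop(height=512, width=512, always_apply=True),
]
training_augmentation = A.Compose(train_transform)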

With regard to the bug, you could do something like this:

def __getitem__(self, i):
        
        # collect batch data
        start = i * self.batch_size
        stop = (i + 1) * self.batch_size
        data = []
        for j in range(start, stop):
            data.append(self.dataset[self.indexes[j]]) # <--- index through self.indexes instead of j so shuffling is respected
        
        # transpose list of lists
        batch = [np.stack(samples, axis=0) for samples in zip(*data)]
        
        return batch

I'm not sure which model you should use; all of the provided weights are from ImageNet, not histopathology images, so you'll need to fine-tune the models for your needs. I would recommend U-Net with one of the EfficientNets as the encoder, or maybe one of the DenseNets; I've had good experience with those on my datasets, but you'll have to run some experiments yourself.
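A sketch of what that fine-tuning setup might look like; the backbone, optimizer, and the choice to freeze the encoder are illustrative assumptions (total_loss and n_classes as in the earlier sketch):

import segmentation_models as sm

model = sm.Unet(
    'efficientnetb3',            # or 'densenet121', etc.
    classes=n_classes,
    activation='softmax',
    encoder_weights='imagenet',  # ImageNet weights as a starting point for fine-tuning
    encoder_freeze=True,         # optionally freeze the encoder for the first few epochs
)
model.compile('Adam', loss=total_loss, metrics=[sm.metrics.IOUScore(threshold=0.5)])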

JordanMakesMaps avatar Jun 06 '20 13:06 JordanMakesMaps

Thanks, I've implemented the size suggestion.

This is the error I get after using the 512 size parameters and (None, None, 3) as you suggested.

Epoch 1/40
---------------------------------------------------------------------------
ResourceExhaustedError                    Traceback (most recent call last)
<ipython-input-41-859d1e145522> in <module>()
      6     callbacks=callbacks,
      7     validation_data=valid_dataloader,
----> 8     validation_steps=len(valid_dataloader),
      9 )

9 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

ResourceExhaustedError:  OOM when allocating tensor with shape[8,816,16,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node block6a_dwconv/depthwise (defined at /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3009) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
 [Op:__inference_keras_scratch_graph_82103]

Function call stack:
keras_scratch_graph

You actually suggested using (None, None, num_classes) in this issue, but (None, None, 3) in the other issue. Which one is correct? Can you also clarify what is happening in your suggested __getitem__ function? What are the range and the list of lists for? Isn't it supposed to return one image?

Thanks

asmagen avatar Jun 06 '20 15:06 asmagen

"OOM when allocating tensor"

You're running out of memory: your GPU can't allocate the amount of memory needed to train that particular model with those dimensions and/or batch size. The ways to fix this are to get another/better GPU, reduce the image dimensions or the batch size, or use a different model architecture.

Your batch size should be one if you want to train with the largest possible image dimensions.

There is no easy way to know what image size you can use given the amount of memory you have available, other than trial and error. If I were you, I would keep decreasing the dimensions until you find the largest size your machine can handle (for example, I use a 1080 Ti and I can use 736 x 736 with a batch size of one using U-Net and EfficientNet backbones below B4).
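As a concrete example, the trade-off with the notebook's Dataloder class would just be (a sketch, assuming the dataset objects are built as in the notebook):

BATCH_SIZE = 1   # the largest patches need the smallest batch
train_dataloader = Dataloder(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
valid_dataloader = Dataloder(valid_dataset, batch_size=1, shuffle=False)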

Input shape for the image should be (None, None, number_of_channels_of_image), sorry about that.
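Putting that correction into code, the model could be created like this (the backbone is an illustrative choice; at prediction time both sides of the image still need to be divisible by 32):

import segmentation_models as sm

model = sm.Unet(
    'efficientnetb3',
    input_shape=(None, None, 3),   # 3 = number of image channels; height and width stay flexible
    classes=n_classes,
    activation='softmax',
    encoder_weights='imagenet',
)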

JordanMakesMaps avatar Jun 06 '20 15:06 JordanMakesMaps

Hi, I am still very confused about the way "background" pixels are treated. I have a lesion dataset, and my masks have values of 255 for the lesion class and 0 for the background. If I treat "background" as a separate class, does that mean I have n_classes=2 and should use sigmoid for the activation?

In fact, if I add "background" as a separate class, my mask has a shape of (height, width, 2), and the Dataset class (in the example notebook) adds yet another mask for "background" when mask.shape[-1] is not equal to 1, so I end up with a mask shape of (height, width, 3). In that case I get an AssertionError, because my n_classes=2 but the dataloader shape is (batchsize, height, width, 3). What is the right way of treating background? @asmagen How did you handle it in your case?

class Dataset:
    """CamVid Dataset. Read images, apply augmentation and preprocessing transformations.
    
    Args:
        images_dir (str): path to images folder
        masks_dir (str): path to segmentation masks folder
        class_values (list): values of classes to extract from segmentation mask
        augmentation (albumentations.Compose): data transformation pipeline 
            (e.g. flip, scale, etc.)
        preprocessing (albumentations.Compose): data preprocessing 
            (e.g. normalization, shape manipulation, etc.)
    
    """
    
    # CLASSES = ['lesion'] 
    
    def __init__(
            self, 
            images_dir, 
            masks_dir, 
            classes=None, 
            augmentation=None, 
            preprocessing=None,
    ):
        self.ids = os.listdir(images_dir)
        self.images_fps = [os.path.join(images_dir, image_id) for image_id in self.ids]
        self.masks_fps = [os.path.join(masks_dir, image_id) for image_id in self.ids]
        
        # convert str names to class values on masks
        # self.class_values = [self.CLASSES.index(cls.lower()) for cls in classes] # for this, should be class1=1, class2=2,..
        self.class_values = [255, 0] # manual entry: lesion pixel value=255, background=0
        
        self.augmentation = augmentation
        self.preprocessing = preprocessing

    def __getitem__(self, i):
        
        # read data
        image = cv2.imread(self.images_fps[i])
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        mask = cv2.imread(self.masks_fps[i], 0) # loads in grayscale mode --> shape: (height, width)
        # print(np.unique(mask)) # [0, 255]

        # extract certain classes from mask (e.g. cars)
        masks = [(mask == v) for v in self.class_values] 
        mask = np.stack(masks, axis=-1).astype('float') 
        # print(mask.shape) # if single class, shape is (height, width, 1)

        # add background if mask is not binary
        if mask.shape[-1] != 1:
            background = 1 - mask.sum(axis=-1, keepdims=True)
            mask = np.concatenate((mask, background), axis=-1) # adds another channel for background
....
CLASSES = ['lesion']
# CLASSES = ['lesion', 'background']
....
# define network parameters
n_classes = 1 if len(CLASSES) == 1 else (len(CLASSES) + 1)  # case for binary and multiclass segmentation
activation = 'sigmoid' if n_classes == 1 else 'softmax'
....
# check shapes for errors
assert train_dataloader[0][0].shape == (BATCH_SIZE, 320, 320, 3)
assert train_dataloader[0][1].shape == (BATCH_SIZE, 320, 320, n_classes)

gizemtanriver avatar Sep 11 '20 11:09 gizemtanriver

So the labels you use need to somehow cover all of the class categories present in the whole dataset. If you have images where lesions are colored 0 in the corresponding mask, while EVERYTHING else in the image is colored 255, then you have two classes. I'm assuming that your images of lesions are the standard images you see in most pathology datasets, where lesions are colored gray and everything else is just black? If so, then yes, you have two classes (thus you should use 'sigmoid' instead of 'softmax', and change the dataloader to 2). Regardless of which dataset you're using, if there are pixels that are not of interest, they still need to be labeled as something (such as background) so that the model knows what's going on with those pixels in the image.

With the CamVid example there are many classes (I think 14 with this version of the dataset), but the author only wants to look at, like, 3 classes. So he grouped everything that wasn't in the 3 classes of interest into another class called 'background'. The model still learns all of the other classes, but it thinks they all belong to the class 'background', and therefore they get grouped together. Does that make sense?

JordanMakesMaps avatar Sep 11 '20 17:09 JordanMakesMaps

@JordanMakesMaps So in that case I have two classes, 'lesion' and 'background'. Does that mean my target shape should be (batchsize, height, width, 2) and the length of self.class_values should be 2? Should I be passing ['lesion', 'background'] instead of ['lesion'] as the classes when creating the dataset object? I thought that in the binary case, information about the background pixels is already captured in my object's mask, so there is no need to add the background pixels as another mask?

# Dataset for train images
train_dataset = Dataset(
    x_train_dir, 
    y_train_dir, 
    classes=CLASSES, 
    augmentation=get_training_augmentation(),
    preprocessing=get_preprocessing(preprocess_input),
)

The following code in the notebook doesn't make sense if my classes are ['lesion', 'background'], because n_classes will then be computed as 3.

# define network parameters
n_classes = 1 if len(CLASSES) == 1 else (len(CLASSES) + 1)  # case for binary and multiclass segmentation
activation = 'sigmoid' if n_classes == 1 else 'softmax'

# check shapes for errors
assert train_dataloader[0][0].shape == (BATCH_SIZE, 320, 320, 3)
assert train_dataloader[0][1].shape == (BATCH_SIZE, 320, 320, n_classes)

gizemtanriver avatar Sep 11 '20 19:09 gizemtanriver

Haven't tested, but maybe try this? If you get any errors, report back and send, like, 3 images and their corresponding masks; then I'll actually be able to help out and we'll figure this out.

Cheers.

class Dataset:
    """CamVid Dataset. Read images, apply augmentation and preprocessing transformations.
    
    Args:
        images_dir (str): path to images folder
        masks_dir (str): path to segmentation masks folder
        class_values (list): values of classes to extract from segmentation mask
        augmentation (albumentations.Compose): data transformation pipeline 
            (e.g. flip, scale, etc.)
        preprocessing (albumentations.Compose): data preprocessing 
            (e.g. normalization, shape manipulation, etc.)
    
    """
 
    
    def __init__(
            self, 
            images_dir, 
            masks_dir, 
            classes=None, 
            augmentation=None, 
            preprocessing=None,
    ):
        self.ids = os.listdir(images_dir)
        self.images_fps = [os.path.join(images_dir, image_id) for image_id in self.ids]
        self.masks_fps = [os.path.join(masks_dir, image_id) for image_id in self.ids]
        
        # convert str names to class values on masks
        self.class_values = [self.CLASSES.index(cls.lower()) for cls in classes] # no need to do it manually now

        self.augmentation = augmentation
        self.preprocessing = preprocessing

    def __getitem__(self, i):
        
        # read data
        image = cv2.imread(self.images_fps[i])
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        mask = cv2.imread(self.masks_fps[i], 0) 
        mask[mask == 255] = 1 # <--- Swap 255 with 1
        # print(np.unique(mask)) # [0, 1] <--- 

        # extract certain classes from mask (e.g. cars)
        masks = [(mask == v) for v in self.class_values]  # <--- class_values contains [0, 1], masks should have len() == 2
        mask = np.stack(masks, axis=-1).astype('float')  # <--- should print (H, W, 2)
        # print(mask.shape) 
    
        # removed all of this
...

CLASSES = ['background', 'lesion'] # <--- Swapped to correspond to correct ordering
....
# define network parameters
n_classes = 2
activation = 'sigmoid' 
....
# check shapes for errors
assert train_dataloader[0][0].shape == (BATCH_SIZE, 320, 320, 3)  # <--- images still have 3 channels
assert train_dataloader[0][1].shape == (BATCH_SIZE, 320, 320, n_classes)


JordanMakesMaps avatar Sep 13 '20 03:09 JordanMakesMaps

This seems to work, thanks! I also tested it with softmax and categorical focal loss just to see what happens, but it doesn't fit very well, so better to use sigmoid and a binary loss as you said.

gizemtanriver avatar Sep 13 '20 10:09 gizemtanriver

Hey, I actually don't understand why we add one more class in the multiclass segmentation case. In my case, I have 5 classes and I define them like this:

CLASSES = ['facade', 'window', 'obstacle', 'sky', 'door']

but the line n_classes = 1 if len(CLASSES) == 1 else (len(CLASSES) + 1) adds one more class. Even if I change it to n_classes = 1 if len(CLASSES) == 1 else len(CLASSES), so that the number of classes looks correct, I get this error during training:

Incompatible shapes: [8,320,320,6] vs. [8,320,320,5]
	 [[{{node training_1/Adam/gradients/loss_1/softmax_loss/binary_focal_loss_plus_dice_loss/mul_6_grad/BroadcastGradientArgs}}]]

If I keep the default settings from the notebook, I have n_classes=6, but then during training I get other shape-mismatch errors at seemingly random points. In this case, I got this

Incompatible shapes: [1,384,480,6] vs. [1,48,1]
	 [[{{node loss/softmax_loss/binary_focal_loss_plus_dice_loss/mul_6}}]] 

at the end of the first epoch, but the error happens at a different time when I change validation_steps.

My shapes:

print('train_dataloader length:', len(train_dataloader))
print('valid_dataloader length:', len(valid_dataloader))
print('train_dataloader shape:', train_dataloader[0][0].shape)
print('valid_dataloader shape:', valid_dataloader[0][0].shape)

train_dataloader length: 20
valid_dataloader length: 46
train_dataloader shape: (8, 320, 320, 3)
valid_dataloader shape: (1, 384, 480, 3)

Any help would be appreciated!

elliestath avatar Feb 15 '22 10:02 elliestath