BiRefNet

Model Zoo - is the general-use model for matting or segmentation?

Open sandy-ssdut opened this issue 1 year ago • 4 comments

Great work!

I noticed that the "general use task" model in the model zoo uses many matting datasets for training, such as TR-P3M-10k. Their GT mask values lie in [0, 1].

So my question is: is the "general use task" model trained for matting or for segmentation? If it is trained as a matting task, how does it differ, in loss or otherwise, from a segmentation task such as the one on the DIS5K dataset?

Thanks very much!

sandy-ssdut avatar Aug 19 '24 11:08 sandy-ssdut

Thanks for your interest! The datasets are mixed, with both binary segmentation ones and soft-mask ones. However, given the proportion of each in the general-use training data and the loss design used for training (mainly BCE and IoU), the model leans more toward binary segmentation.

ZhengPeng7 avatar Aug 19 '24 14:08 ZhengPeng7

Thanks for your reply~

The matting GT masks are not preprocessed into binary masks; they are used with values in [0, 1] during training. Do I understand correctly?

In this situation:

  1. The IoU loss equals 1 - intersection / union, with intersection = torch.sum(gt_mask * pred_mask). When the GT value of a pixel lies strictly between 0 and 1, for example 0.5 for every pixel in an image, the intersection becomes torch.sum(0.5 * pred_mask). The predicted mask then tends toward 1 for all pixels during training rather than 0.5, because that gives a smaller IoU loss.

  2. The BCE loss equals -(y log p + (1 - y) log(1 - p)). When the GT mask is 0.5 for all pixels, the predicted mask tends toward 0.5 rather than 1, because that gives a smaller BCE loss.

Do these two losses conflict?
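The two cases above can be checked numerically with a minimal sketch. It assumes the common soft-IoU form union = sum(gt) + sum(pred) - intersection, which the thread does not state, and it uses plain Python lists in place of tensors. The function names `soft_iou_loss` and `bce_loss` are hypothetical and are not taken from the BiRefNet code.

```python
import math

def soft_iou_loss(gt, pred):
    """1 - intersection / union over flat lists of values in [0, 1].

    Assumption: union = sum(gt) + sum(pred) - intersection.
    """
    inter = sum(g * p for g, p in zip(gt, pred))
    union = sum(gt) + sum(pred) - inter
    return 1.0 - inter / union

def bce_loss(gt, pred, eps=1e-7):
    """Mean binary cross-entropy; pred is clamped away from 0 and 1."""
    total = 0.0
    for g, p in zip(gt, pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= g * math.log(p) + (1.0 - g) * math.log(1.0 - p)
    return total / len(gt)

# A toy "image" whose GT is 0.5 everywhere, scored against two predictions.
gt = [0.5] * 16
pred_half = [0.5] * 16  # prediction matches the soft GT
pred_one = [1.0] * 16   # prediction saturates to 1

iou_half = soft_iou_loss(gt, pred_half)
iou_one = soft_iou_loss(gt, pred_one)
bce_half = bce_loss(gt, pred_half)
bce_one = bce_loss(gt, pred_one)

print(f"IoU loss: pred=0.5 -> {iou_half:.4f}, pred=1 -> {iou_one:.4f}")
print(f"BCE loss: pred=0.5 -> {bce_half:.4f}, pred=1 -> {bce_one:.4f}")
```

Under these assumptions the IoU loss is lower for the all-ones prediction while the BCE loss is lower for the 0.5 prediction, which is the disagreement described in points 1 and 2.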

sandy-ssdut avatar Aug 20 '24 02:08 sandy-ssdut

Yeah. That's part of the reason I said the model leans more toward binary segmentation. For matting, it might introduce some conflict, but most of the datasets used have GT values of 0 or 1, and values like 0.5 appear only rarely; the weights of the BCE and IoU losses are also different. Thus, in practice, it works as usual. I am also aware of the differences between segmentation and matting losses, but I currently have no GPUs to run experiments on this.
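To illustrate how the loss weights matter for a soft GT value, here is a rough sketch that searches for the constant prediction minimizing a weighted sum of BCE and soft-IoU loss when every GT pixel is 0.5. The weights `w_bce` and `w_iou` are hypothetical placeholders, not the values used to train BiRefNet, and the soft-IoU form (union = gt + pred - gt * pred per pixel) is an assumption.

```python
import math

def combined_loss(p, g=0.5, w_bce=1.0, w_iou=1.0):
    """Weighted BCE + soft-IoU loss for an image with constant GT g and
    constant prediction p (per-pixel closed forms; the pixel count cancels)."""
    bce = -(g * math.log(p) + (1.0 - g) * math.log(1.0 - p))
    iou = 1.0 - (g * p) / (g + p - g * p)
    return w_bce * bce + w_iou * iou

def best_constant_pred(w_bce, w_iou, g=0.5, steps=1000):
    """Grid search over p in (0, 1) for the lowest combined loss."""
    grid = [i / steps for i in range(1, steps)]
    return min(grid, key=lambda p: combined_loss(p, g, w_bce, w_iou))

# Hypothetical weightings, chosen only to show the trend.
p_bce_only = best_constant_pred(w_bce=1.0, w_iou=0.0)
p_mixed = best_constant_pred(w_bce=1.0, w_iou=1.0)
p_iou_heavy = best_constant_pred(w_bce=1.0, w_iou=10.0)

print(p_bce_only, p_mixed, p_iou_heavy)
```

With BCE alone the best constant prediction is the GT value 0.5; adding the IoU term pulls it above 0.5, and more so as the IoU weight grows. For pixels whose GT is exactly 0 or 1 both terms agree, which is consistent with the remark that the conflict matters little when most GT values are binary.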

ZhengPeng7 avatar Aug 20 '24 04:08 ZhengPeng7

Got it. Thanks very much.

sandy-ssdut avatar Aug 20 '24 05:08 sandy-ssdut