add tile ratio warning when there is class imbalance

Open Geoyi opened this issue 7 years ago • 1 comments

@drewbo, should we add a class imbalance warning?

I created a bounding box for an image classification task (a building classifier), set background_ratio to 1, and assumed Label Maker would produce a balanced class ratio. But in this case the bounding box contained mostly building tiles, and I ended up with only 9 background tiles out of 340. If we add a class imbalance warning, it will help when evaluating the training dataset.
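A minimal sketch of what such a warning could look like, assuming the per-class tile counts are already known. The function name `check_background_ratio` and the `tolerance` parameter are hypothetical and not part of Label Maker; the counts are the ones from the case above (340 tiles, 9 of them background).

```python
import warnings


def check_background_ratio(n_foreground, n_background, background_ratio, tolerance=0.5):
    """Return the observed background/foreground ratio.

    Hypothetical helper, not Label Maker's API. Emits a UserWarning when the
    observed ratio falls below `tolerance` times the requested ratio.
    """
    if n_foreground == 0:
        # Nothing to compare against; no ratio can be computed.
        return None
    observed = n_background / n_foreground
    if observed < background_ratio * tolerance:
        warnings.warn(
            f"Class imbalance: requested background_ratio={background_ratio}, "
            f"but found {n_background} background vs {n_foreground} foreground "
            f"tiles (observed ratio {observed:.2f}).",
            UserWarning,
        )
    return observed


# Counts from the case above: 340 tiles in total, 9 of them background,
# which leaves 340 - 9 = 331 foreground tiles.
observed = check_background_ratio(n_foreground=331, n_background=9, background_ratio=1)
```

The threshold is a judgment call; a stricter or looser `tolerance` may suit other datasets better.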

Geoyi, Jul 27 '18 15:07

Good catch. I think there are two underlying pieces here to better handle this:

  • For classification problems, we are "correctly" handling the background ratio: we try to collect enough background tiles to match the ratio, but if there aren't enough in the bounding box, we can't create new ones. We also print the number of tiles in each class. For the case above, should we generate only 9 foreground tiles because there aren't enough background tiles to match, or is the class summary sufficient?
  • For object detection and segmentation problems, we aren't properly respecting the background ratio, because I had assumed (somewhat erroneously, based on how we implemented skynet-data) that the background information within those tiles would be enough. In reality, this probably produces models that are prone to false positives and that need some negative mining. This needs to be fixed.
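
To make the first option concrete, here is a rough sketch of capping the foreground tiles so that the requested ratio holds when background tiles are scarce. This is not how Label Maker currently behaves; `cap_foreground`, its arguments, and the tile names are hypothetical and used only for illustration.

```python
import random


def cap_foreground(foreground_tiles, background_tiles, background_ratio, seed=0):
    """Downsample foreground tiles to match the available background tiles.

    Hypothetical helper, not Label Maker's implementation. Keeps at most
    len(background_tiles) / background_ratio foreground tiles, chosen at random.
    """
    if background_ratio <= 0:
        raise ValueError("background_ratio must be positive")
    max_foreground = int(len(background_tiles) / background_ratio)
    n_keep = min(len(foreground_tiles), max_foreground)
    # A seeded generator keeps the selection reproducible between runs.
    rng = random.Random(seed)
    return rng.sample(list(foreground_tiles), n_keep)


# Case above: 331 foreground tiles, 9 background tiles, background_ratio of 1.
foreground = [f"fg-{i}" for i in range(331)]
background = [f"bg-{i}" for i in range(9)]
kept = cap_foreground(foreground, background, background_ratio=1)
print(len(kept), "foreground tiles kept for", len(background), "background tiles")
```

The trade-off is visible in the numbers: the ratio is respected, but most of the foreground tiles are discarded, which is why printing the class summary alone may be the preferable choice.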

drewbo, Jul 30 '18 13:07