label-maker
add tile ratio warning when there is class imbalance
@drewbo, should we add a class imbalance warning?
I created a bounding box for an image classification task (a building classifier), set background_ratio to 1, and assumed Label Maker would produce a balanced class ratio. But in this case the bounding box contained almost only building tiles, and only 9 of the 340 tiles ended up as background tiles. A class imbalance warning would help when evaluating the training dataset.
Good catch. I think there are two underlying pieces here to better handle this:
- For classification problems, we are "correctly" handling the background ratio: we try to get enough tiles to match the ratio, but if there aren't enough, we can't make new background tiles. We also print out the number of tiles in each class. For the case above, should we only generate 9 foreground tiles because there aren't enough background tiles to match, or is the class summary sufficient?
- For object detection + segmentation problems, we aren't properly respecting the background ratio because I had assumed (somewhat erroneously, and based on how we implemented skynet-data) that the background information from those tiles would be enough. In reality, it probably creates models that are prone to false positives and need some negative mining. This needs to be fixed.
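
As a starting point for discussion, here is a minimal sketch of what a class imbalance warning for the classification case could look like. It is not Label Maker's actual code: the function name `check_background_ratio`, the `"foreground"`/`"background"` string labels, and the `tolerance` parameter are all hypothetical, and it assumes background_ratio means the desired number of background tiles per foreground tile.

```python
import warnings
from collections import Counter


def check_background_ratio(tile_labels, background_ratio=1.0, tolerance=0.5):
    """Warn when there are far fewer background tiles than requested.

    Hypothetical helper, not part of Label Maker.
    tile_labels: iterable of "foreground" / "background" strings (assumed format).
    background_ratio: desired background tiles per foreground tile (assumed meaning).
    tolerance: fraction of the expected background count below which we warn.
    Returns True if the ratio is acceptable, False if a warning was issued.
    """
    counts = Counter(tile_labels)
    n_foreground = counts.get("foreground", 0)
    n_background = counts.get("background", 0)
    expected_background = background_ratio * n_foreground

    if n_background < tolerance * expected_background:
        warnings.warn(
            "Class imbalance: found {} background tiles for {} foreground tiles "
            "(expected about {:.0f} with background_ratio={}).".format(
                n_background, n_foreground, expected_background, background_ratio
            )
        )
        return False
    return True


# Numbers from the report above: 340 tiles, 9 of them background.
labels = ["foreground"] * 331 + ["background"] * 9
check_background_ratio(labels, background_ratio=1.0)
```

With the numbers from the report, about 331 background tiles would be expected and only 9 exist, so the warning fires. Whether to warn only or to also cap the foreground tiles is the open question from the first bullet; this sketch only warns.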