cocottributes

Label noise very high

Open duckduck-sys opened this issue 5 years ago • 2 comments

Looking through the annotated samples, I noticed that the label noise for the gender and age attributes of the person class is extremely high: roughly 20 to 30 percent of the labels are wrong (see the attached photos). The largest sources of error are:

  • Many photos of the Person class have no gender assigned, even when it is clearly visible (~30%).
  • Many photos of females are wrongly labeled as male (~20%).
  • For the age categories (Young, Adult, Old), almost all photos are simply labeled Adult, even when the person in question is clearly Young or Old. There are substantially more old people without the Old label than with it.

I haven't checked the other object categories, but I suspect the label noise is of similar magnitude. It would be nice if the repo stated that the labels in the dataset suffer from a high error rate.
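For anyone who wants to put numbers like the ones above on firmer ground, here is a minimal sketch of how a manual audit could be tallied. It assumes you have reviewed a random sample by hand and recorded, for each item, the attribute name, the dataset's label, and your own label. The tuple layout, the function names, and the use of `None` for "not assigned" are all hypothetical and not part of the cocottributes code or data format. The Wilson score interval is used only to show how uncertain an estimate from a small sample is.

```python
import math
from collections import defaultdict


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for an error proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = errors / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def audit_error_rates(audit):
    """Tally disagreements per attribute.

    `audit` is a list of (attribute, dataset_label, reviewer_label) tuples,
    a hypothetical layout for a hand-reviewed sample. A dataset_label of
    None stands for "attribute not assigned".
    Returns {attribute: (errors, n, error_rate)}.
    """
    counts = defaultdict(lambda: [0, 0])  # attribute -> [errors, total]
    for attribute, dataset_label, reviewer_label in audit:
        counts[attribute][1] += 1
        if dataset_label != reviewer_label:
            counts[attribute][0] += 1
    return {a: (e, n, e / n) for a, (e, n) in counts.items()}


if __name__ == "__main__":
    # Toy sample for illustration only; these are not real audit results.
    sample = [
        ("gender", None, "female"),
        ("gender", "male", "female"),
        ("gender", "male", "male"),
        ("age", "Adult", "Old"),
        ("age", "Adult", "Adult"),
    ]
    for attribute, (errors, n, rate) in audit_error_rates(sample).items():
        low, high = wilson_interval(errors, n)
        print(f"{attribute}: {errors}/{n} wrong, rate {rate:.2f}, "
              f"interval {low:.2f} to {high:.2f}")
```

With only a handful of reviewed images the interval is very wide, so an audit of a few hundred randomly drawn samples per attribute would be needed before quoting a rate with any confidence.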

182952 183056 184930

duckduck-sys • Dec 18 '20 03:12

I agree with you, @duckduck-sys. I've been working with this dataset for months and have seen many pictures that contain a person but whose labels do not mention the person class. The same thing happens for some other classes as well. I think the cocoAttribute dataset needs some modification before it can be used in other projects.

BehnazDibayee • Jan 12 '21 14:01

I was curious whether pre-trained PyTorch models are available. I see the training code etc. in the PyTorch directory, but no directions for getting the trained models.

TouqeerAhmad • Jan 25 '21 20:01