parti
Localized Narratives Benchmarking
Hi @JiahuiYu / @jasonbaldridge, I was wondering what "oversampling" means in this line from the paper: "The validation set of the Localized Narratives COCO split contains only 5,000 unique images, so we follow [47] in oversampling the captions to acquire 30,000 generated images."
Do you sample 6 images for each unique caption to reach 30k images, or do you generate 1 image per modified version of each unique caption (e.g., taking n random sentences from it)?
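To make the two readings concrete, here is a rough sketch of what I have in mind. The function names, the splitting of captions into sentences on periods, and the default of 2 sentences are my own assumptions for illustration, not anything taken from the paper or from [47].

```python
import random


def oversample_repeat(captions, per_caption=6):
    """Reading 1: reuse each caption verbatim, once per generated image."""
    return [c for c in captions for _ in range(per_caption)]


def oversample_subsample(captions, per_caption=6, n_sentences=2, seed=0):
    """Reading 2: one image per modified caption (n random sentences each).

    Hypothetical: sentences are split naively on '.', which is an assumption.
    """
    rng = random.Random(seed)
    prompts = []
    for caption in captions:
        sentences = [s.strip() for s in caption.split(".") if s.strip()]
        k = min(n_sentences, len(sentences))
        for _ in range(per_caption):
            # Sample k sentence positions and keep them in original order.
            idx = sorted(rng.sample(range(len(sentences)), k))
            prompts.append(". ".join(sentences[i] for i in idx) + ".")
    return prompts
```

Either way, 5,000 captions × 6 prompts per caption gives the 30,000 images mentioned in the paper; the question is only which of these (if either) matches what was done.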