Oscar
nocaps inference
What is the input for nocaps inference: the raw image, or the image features extracted through VinVL?