mm-cot
Multi-label classification does not work.
I am trying to use it for zero-shot image classification. Given an image in which both Adidas and Nike appear, and the text input ["Adidas", "Nike"], the output is only "Adidas". Ideally, both text labels should have been picked up.
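To illustrate the behavior I expected, here is a minimal sketch in plain Python. It does not use mm-cot's code or API. The per-label scores are made-up numbers, and `single_label` and `multi_label` are hypothetical helper names. It assumes the model yields one score per candidate label: taking the top softmax entry can only ever return one label, whereas thresholding each label independently can return both.

```python
import math

def softmax(scores):
    # Normalize scores into probabilities that compete and sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    # Map a single score to an independent probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def single_label(labels, scores):
    # Single-label decision: return only the highest-probability label.
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best]

def multi_label(labels, scores, threshold=0.5):
    # Multi-label decision: keep every label whose own probability
    # clears the threshold, regardless of the other labels.
    return [label for label, s in zip(labels, scores) if sigmoid(s) >= threshold]

labels = ["Adidas", "Nike"]
scores = [2.1, 1.9]  # hypothetical per-label scores, not real model output

print(single_label(labels, scores))
print(multi_label(labels, scores))
```

With these made-up scores the first call returns only "Adidas", which matches what I observe, while the second returns both labels. Whether the model exposes per-label scores like this is an assumption on my part.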