Help understanding Table 1 in paper
Hi there
I am trying to understand Table 1 in the paper -- specifically the "Fine-grained_CUB" and "Composition" columns.
For the Fine-grained_CUB column, can you please point me to the script that was used to obtain these results? I can only find the test_domain.sh (col 1) and test_imagenet.sh (col 2) scripts in scripts/
I also wondered if you could please help me understand test_imagenet_composition.sh. From what I can see, this script is identical to test_imagenet.sh. Can you help me understand the difference?
Thanks in advance, Daniela
Hi Daniela,
- Fine-grained CUB requires a lot of engineering effort to automate the evaluation. It also requires a large number of image generations, which adds friction. For these reasons, we haven't released the scripts for evaluating on CUB. That said, if you need them, I can recover the scripts and share them in a separate repository.
- Overall, we have two evaluation criteria: 1) Concept Alignment and 2) Composition Alignment. Composition Alignment is evaluated with respect to the reference text (with complex compositions). Concept Alignment is further split into two categories:
- Simple concept overfitting, where we have images generated using a simple prompt: "A photo of <V*>"
- Composition concept alignment, where we have images generated using the composite prompt: "A photo of [one/two] <V*> at [X/Y/Z location]"
Hence, both use the same evaluation script; only the data changes.
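
As a rough illustration of the two prompt settings, here is a minimal Python sketch. It is not code from the repository: the helper names (`simple_prompt`, `composition_prompt`, `build_eval_prompts`) are hypothetical, and "X", "Y", "Z" are just the location placeholders from the template above. It only shows how the two settings could differ in the prompts used to generate the data while sharing everything downstream.

```python
from typing import List

# Illustrative sketch only: these helper names are hypothetical and are not
# taken from the repository's scripts.


def simple_prompt(token: str = "<V*>") -> str:
    """Prompt for the simple concept-overfitting setting."""
    return f"A photo of {token}"


def composition_prompt(count: str, location: str, token: str = "<V*>") -> str:
    """Composite prompt built from a count ("one"/"two") and a location."""
    return f"A photo of {count} {token} at {location}"


def build_eval_prompts(setting: str) -> List[str]:
    """Return the generation prompts for a setting.

    Only the prompts (and hence the generated images) differ between the
    two settings; the evaluation applied afterwards would be the same.
    """
    if setting == "simple":
        return [simple_prompt()]
    if setting == "composition":
        # "X", "Y", "Z" stand in for the location placeholders.
        return [
            composition_prompt(count, location)
            for count in ("one", "two")
            for location in ("X", "Y", "Z")
        ]
    raise ValueError(f"unknown setting: {setting}")
```

Calling `build_eval_prompts("simple")` yields the single simple prompt, while `build_eval_prompts("composition")` yields one prompt per count and location combination.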