Kanchana Ranasinghe issues

Results: 5 issues by Kanchana Ranasinghe

dependencies

Which exact versions of TensorFlow and PyTorch did you use for this? I can't get meta-dataset and this codebase working in the same environment.

HF text-tower gives NaN with amp / fp16

Training models with HF text towers gives NaN. The NaN in the loss appears from the first step.

```
torchrun --nproc_per_node 4 -m training.main \
    --train-data "${ROOT}/data/cc12m/{00000..01242}.tar" \
    --train-num-samples 10968539 \
    ...
```
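As general background on how NaNs can arise under fp16 (not a diagnosis of the run above, whose cause is not stated in this excerpt): half precision has a largest finite value of 65504, so larger intermediate values overflow to infinity, and a later `inf - inf` yields NaN. The sketch below illustrates this with the standard library only. `to_fp16` and `check_finite` are hypothetical helper names introduced for illustration; they are not part of the training code or of any library.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round x to half precision, mapping overflow to +/-inf (illustrative helper)."""
    if math.isnan(x) or math.isinf(x):
        return x
    try:
        # The struct 'e' format is IEEE 754 binary16.
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack out-of-range values; emulate hardware overflow.
        return math.copysign(math.inf, x)


def check_finite(loss: float, step: int) -> None:
    """Raise as soon as the loss becomes NaN or inf (illustrative guard)."""
    if not math.isfinite(loss):
        raise FloatingPointError(f"non-finite loss {loss!r} at step {step}")


# Two large values that are both finite in float32/float64 ...
a, b = 70000.0, 80000.0
# ... both overflow to inf in half precision, so their difference is NaN.
diff = to_fp16(a) - to_fp16(b)
print(to_fp16(a), to_fp16(b), diff)
```

A guard such as `check_finite(loss, step)` called at step 0 would confirm whether the first forward pass already produces a non-finite value, which is what the report above describes.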

Unexpected outputs in newer version of model (v1.5)

### Question

I use prompts like `List the 10 main objects found in this image. Output the category names as a list of strings.` and on some images, it produces...

About segmentation outputs

Are the segmentation outputs (coordinates) predicted directly by the network as floating-point numbers under a next-token prediction loss? This part is quite unclear in the paper. Or are they regressed (using...

Evaluating LLaVA: batch size > 1?

Good job on the really insightful and useful paper! The quantitative metrics are really useful when working with these models. **Question:** When evaluating the LLaVA baseline (Table 4 in paper),...