Google Drive scripts: incorrect iteration

Open SM20sam opened this issue 5 months ago • 2 comments

Consider the structure of an entry in the dataset's JSON file (https://huggingface.co/datasets/drive-bench/arena):

    [
      {
        "scene_token": "da41ecbc644b4915b84bb732e35ebf8c",
        "frame_token": "7e4c3282bc2a4402b5d1d6705f9eb844",
        "question_type": "robust_qas",
        "question": "What is the current type of corruption?: A. Fog. B. Bit error. C. JPEG compression. D. Sensor failure",
        "answer": "D. Sensor failure",
        "tag": [0],
        "image_path": {
          "CAM_FRONT": "./CameraCrash/CAM_FRONT/n008-2018-08-30-15-52-26-0400__CAM_FRONT__1535658934012637.jpg",
          "CAM_FRONT_LEFT": "./CameraCrash/CAM_FRONT_LEFT/n008-2018-08-30-15-52-26-0400__CAM_FRONT_LEFT__1535658934004799.jpg",
          "CAM_FRONT_RIGHT": "./CameraCrash/CAM_FRONT_RIGHT/n008-2018-08-30-15-52-26-0400__CAM_FRONT_RIGHT__1535658934020482.jpg",
          "CAM_BACK": "./CameraCrash/CAM_BACK/n008-2018-08-30-15-52-26-0400__CAM_BACK__1535658934037558.jpg",
          "CAM_BACK_LEFT": "./CameraCrash/CAM_BACK_LEFT/n008-2018-08-30-15-52-26-0400__CAM_BACK_LEFT__1535658934047405.jpg",
          "CAM_BACK_RIGHT": "./CameraCrash/CAM_BACK_RIGHT/n008-2018-08-30-15-52-26-0400__CAM_BACK_RIGHT__1535658934028113.jpg"
        }
      },
      {

In the Google Drive scripts (https://drive.google.com/drive/folders/18p9JRMNEVA-wBaKMJK8S5g5yykPp6kqf), llava1.6_dist.py incorrectly iterates over the keys of image_path, such as "CAM_BACK", instead of the actual file paths:

    filenames = batch['images'] # should be image_path
    batch_size = len(filenames)

    assert batch_size == 1, "Currently only support batch size 1"

    # Load images and build image placeholders and multi_modal_data
    image_placeholders = [''] * batch_size
    multi_modal_datas = [dict(image=[]) for _ in range(batch_size)]
    system_prompts = [self.system_prompt] * batch_size

    for idx, sample_filenames in enumerate(filenames):
        # Handle corruption if needed
        image_index = 1
        # Replace system prompt
        system_prompts[idx] = replace_system_prompt(system_prompts[idx], sample_filenames)
        for filename in sample_filenames:
            img_path = filename

In this situation, img_path is a dictionary key such as "CAM_BACK" rather than a path to an image.

The actual image path (for example ./CameraCrash/CAM_BACK/n008-2018-08-30-15-52-26-0400__CAM_BACK__1535658934037558.jpg) is never extracted, so the following error occurs:

[Errno 2] No such file or directory: 'CAM_BACK'
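
For illustration, here is a minimal sketch of how the paths could be read from the image_path values instead of its keys. It is not the maintainers' fix: the helper name resolve_image_paths, the data_root argument, and the /data/drivebench location are all hypothetical, and the sample entry is trimmed to two cameras from the dataset example.

```python
from pathlib import Path

# Sample entry trimmed to two cameras, copied from the dataset example.
sample = {
    "image_path": {
        "CAM_FRONT": "./CameraCrash/CAM_FRONT/n008-2018-08-30-15-52-26-0400__CAM_FRONT__1535658934012637.jpg",
        "CAM_BACK": "./CameraCrash/CAM_BACK/n008-2018-08-30-15-52-26-0400__CAM_BACK__1535658934037558.jpg",
    }
}


def resolve_image_paths(image_path, data_root="."):
    """Hypothetical helper: map each camera name to a full path.

    Uses the dict *values* (relative file paths), not the keys.
    data_root is an assumed location of the extracted dataset.
    """
    return {cam: Path(data_root) / rel for cam, rel in image_path.items()}


# Iterating over a dict directly yields its keys, which reproduces the bug:
keys_only = [name for name in sample["image_path"]]

# Iterating over items() gives both the camera name and the actual path:
paths = resolve_image_paths(sample["image_path"], data_root="/data/drivebench")
for cam, path in paths.items():
    print(cam, path)
```

Keeping the camera name alongside the path (rather than calling values() alone) may be useful if the prompt needs to say which view each image comes from.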

The llava1.5 inference script in the GitHub repository is correct.

SM20sam avatar Aug 05 '25 03:08 SM20sam

Thanks for reporting. The scripts we provided in Google Drive come from the unorganized codebase we used initially, so we haven't included them in the GitHub repository. We will try to refine the inference scripts for those models and add them to the current codebase.

Daniel-xsy avatar Aug 05 '25 16:08 Daniel-xsy

Could you possibly publish the final, correct inference and evaluation codebase for all models mentioned in your paper?

pzhwuhu avatar Sep 16 '25 07:09 pzhwuhu