Running the inference

Open Comax752 opened this issue 10 months ago • 4 comments

Dear authors, thank you for your great work!

I'm trying to run inference for the planning tasks, so I downloaded the pretrained model and split files from your Google Drive and adjusted the advqa_t5_elm.yaml config file by changing the following lines:

    images:
      storage: 'data/nuscenes'
    annotations:
      train:
        storage: 'data/drivelm/planning_train.json'
      test:
        storage: 'data/drivelm/planning_val.json'
      val:
        storage: 'data/drivelm/planning_val.json'

I then changed resume_ckpt_path to load the pretrained model:

    resume_ckpt_path: "lavis/output/elm_checkpoint_pretrain.pth"

Finally, I enabled inference as described in the README by setting:

    evaluate: True

All paths are absolute in my file; I changed them to relative paths only for this post. As far as I can tell, there were no issues in loading the split files or the checkpoint file. When the script tries to load the checkpoint file, two errors occur:

    RuntimeError: Error(s) in loading state_dict for PeftModel: Missing key(s) in state_dict: "base_model.model.extra_query_tokens", "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.q.base_layer.weight", "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.v.base_layer.weight",...

and later:

    Unexpected key(s) in state_dict: "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.q.weight", "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.v.weight", "base_model.model.t5_model.encoder.block.1.layer.0.SelfAttention.q.weight",...

I also tried the other checkpoint files (ckpt_elm_drivelm, ckpt_elm_box.pth), and the same errors occur each time; only the lists of missing and unexpected keys differ between checkpoints.
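Several of the missing keys contain `.base_layer` where the corresponding unexpected keys do not. That pattern would be consistent with the checkpoint and the installed PEFT version naming LoRA-wrapped layers differently, although nothing in this thread confirms it. A minimal sketch in plain Python for separating keys that differ only by that segment from keys that have no counterpart at all; `match_renamed_keys` is a hypothetical helper, not part of ELM, LAVIS, or PEFT, and the key names are the ones quoted in the error above:

```python
def match_renamed_keys(missing, unexpected, marker=".base_layer"):
    """Pair each missing key with the unexpected key obtained by dropping `marker`.

    Returns (paired, unmatched_missing, unmatched_unexpected).
    """
    unexpected_set = set(unexpected)
    paired = {}
    unmatched_missing = []
    for key in missing:
        # Remove the first occurrence of the marker and look for that name.
        candidate = key.replace(marker, "", 1)
        if candidate != key and candidate in unexpected_set:
            paired[key] = candidate
        else:
            unmatched_missing.append(key)
    unmatched_unexpected = sorted(unexpected_set - set(paired.values()))
    return paired, unmatched_missing, unmatched_unexpected


# Key names copied from the (truncated) error message.
missing = [
    "base_model.model.extra_query_tokens",
    "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.q.base_layer.weight",
    "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.v.base_layer.weight",
]
unexpected = [
    "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.q.weight",
    "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.v.weight",
    "base_model.model.t5_model.encoder.block.1.layer.0.SelfAttention.q.weight",
]

paired, only_missing, only_unexpected = match_renamed_keys(missing, unexpected)
print("differ only by '.base_layer':", len(paired))
print("missing with no counterpart:", only_missing)
print("unexpected with no counterpart:", only_unexpected)
```

Because the error message is truncated, an unmatched key on either side may only reflect the cut-off lists. A key such as `extra_query_tokens`, which has no renamed counterpart, would instead suggest that the model being built differs from the one the checkpoint was saved from.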

Can you help me with this issue? Thank you very much in advance!

Comax752, Mar 21 '25 11:03

Hi, during the training of ELM, different tasks may result in slight variations in the model architecture. The planning task was trained separately, so if you're interested in trying this task, we wouldn't recommend using those checkpoints.

zhouyunsong, Apr 08 '25 06:04

Hello, I would like to test the performance of the traffic signal inquiry task. Which checkpoint should I use and where can I obtain the ckpt file?

Miranda0920, Jun 30 '25 08:06

Instead of having a dedicated model for each task, we rely on a small set of models that generalize across tasks. If I recall correctly, you can try loading the pretrained checkpoint and using it with the traffic sign inquiry dataloader.

zhouyunsong, Jul 08 '25 05:07

Thanks for your reply!

I used the ckpt_elm_drivelm.pth model on a single A100 GPU with 40 GB of memory, but encountered an OOM error when loading the checkpoint. What GPU specifications (e.g., type and VRAM) are required for inference?
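The thread does not state the GPU requirements. One common way to narrow down an OOM at load time is to deserialize the checkpoint onto the CPU first and check how large the stored weights are. A minimal sketch, assuming PyTorch is installed; `state_dict_size_gib` and `load_checkpoint_on_cpu` are hypothetical helpers, and the assumption that the weights sit under a `"model"` key is not confirmed here:

```python
def state_dict_size_gib(state_dict):
    """Sum the storage size of all tensor-like values, in GiB.

    Values without a `numel` method (e.g. an epoch counter) are skipped.
    """
    total_bytes = sum(
        t.numel() * t.element_size()
        for t in state_dict.values()
        if hasattr(t, "numel") and hasattr(t, "element_size")
    )
    return total_bytes / 1024**3


def load_checkpoint_on_cpu(path):
    """Deserialize a checkpoint into host memory instead of GPU memory."""
    import torch  # imported lazily so the size helper works without PyTorch

    return torch.load(path, map_location="cpu")


# Usage (path taken from the post above; the "model" key is an assumption):
# ckpt = load_checkpoint_on_cpu("ckpt_elm_drivelm.pth")
# weights = ckpt.get("model", ckpt)
# print(f"{state_dict_size_gib(weights):.2f} GiB of weights")
```

This reports only the size of the stored weights. Activations and any model components built outside the checkpoint add to the memory actually needed on the GPU.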

Miranda0920, Jul 22 '25 12:07