Running the inference
Dear authors, thank you for your great work!
I'm trying to run inference for the planning task, so I downloaded the pretrained model and split files from your Google Drive and adjusted the advqa_t5_elm.yaml config file by changing the following lines:
images:
  storage: 'data/nuscenes'
annotations:
  train:
    storage: 'data/drivelm/planning_train.json'
  test:
    storage: 'data/drivelm/planning_val.json'
  val:
    storage: 'data/drivelm/planning_val.json'
I also changed resume_ckpt_path to load the pretrained model:
resume_ckpt_path: "lavis/output/elm_checkpoint_pretrain.pth"
Finally, I enabled inference as described in the README by setting:
evaluate: True
In my actual file all paths are absolute; I changed them to relative paths only for this post. As far as I can tell, the split files are found without problems.
When the script tries to load the checkpoint file, two errors occur:
RuntimeError: Error(s) in loading state_dict for PeftModel: Missing key(s) in state_dict: "base_model.model.extra_query_tokens", "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.q.base_layer.weight", "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.v.base_layer.weight",...
and later:
Unexpected key(s) in state_dict: "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.q.weight", "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention.v.weight", "base_model.model.t5_model.encoder.block.1.layer.0.SelfAttention.q.weight",...
I also tried the other checkpoint files (ckpt_elm_drivelm, ckpt_elm_box.pth), and the same errors occur each time; only the lists of missing and unexpected keys differ between checkpoints.
Can you help me with this issue? Thank you very much in advance!
Hi, during the training of ELM, different tasks may result in slight variations in the model architecture. The planning task was trained separately, so if you're interested in trying this task, we wouldn't recommend using those checkpoints.
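For anyone debugging the same error, the missing and unexpected key lists can be compared programmatically. The sketch below is not part of the ELM repository; `diff_keys` and `suggest_renames` are hypothetical helper names. It relies on one observation from the error messages above: several unexpected keys (`...q.weight`) differ from missing keys (`...q.base_layer.weight`) only by a `.base_layer` segment. That pattern may come from the installed peft version wrapping layers differently from the version used to save the checkpoint, but that is an assumption and not confirmed in this thread. Keys with no counterpart, such as `extra_query_tokens`, would instead be consistent with the architecture differences mentioned above.

```python
def diff_keys(model_keys, ckpt_keys):
    """Return (missing, unexpected) key lists, mirroring load_state_dict's report."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)      # model expects, checkpoint lacks
    unexpected = sorted(ckpt_keys - model_keys)   # checkpoint has, model lacks
    return missing, unexpected


def suggest_renames(missing, unexpected, marker=".base_layer"):
    """Pair unexpected keys with missing keys that differ only by `marker`.

    The marker is inserted before the final name component, e.g.
    '...q.weight' -> '...q.base_layer.weight'. This pattern is taken from
    the error messages in this thread; it is an assumption that it applies
    to every affected key.
    """
    missing_set = set(missing)
    renames = {}
    for key in unexpected:
        head, dot, tail = key.rpartition(".")
        if not dot:
            continue  # no dotted prefix, nothing to insert the marker into
        candidate = f"{head}{marker}.{tail}"
        if candidate in missing_set:
            renames[key] = candidate
    return renames


if __name__ == "__main__":
    # Key names copied from the error messages above.
    prefix = "base_model.model.t5_model.encoder.block.0.layer.0.SelfAttention"
    model_keys = [
        "base_model.model.extra_query_tokens",
        f"{prefix}.q.base_layer.weight",
        f"{prefix}.v.base_layer.weight",
    ]
    ckpt_keys = [f"{prefix}.q.weight", f"{prefix}.v.weight"]

    missing, unexpected = diff_keys(model_keys, ckpt_keys)
    renames = suggest_renames(missing, unexpected)
    unexplained = sorted(set(missing) - set(renames.values()))
    print("renamable:", renames)
    print("still missing after renaming:", unexplained)
```

With a real model, the key names would come from `model.state_dict().keys()` and from the loaded checkpoint dictionary; how the checkpoint nests its weights is not stated in this thread, so that part needs to be checked by hand. Any keys left in `unexplained` cannot be fixed by renaming and point to a checkpoint trained with a different model configuration.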
Hello, I would like to test the performance of the traffic signal inquiry task. Which checkpoint should I use and where can I obtain the ckpt file?
Instead of having a dedicated model for each task, we rely on a small set of models that generalize across tasks. If I recall correctly, you can try loading the pretrained checkpoint and using it with the traffic sign inquiry dataloader.
Thanks for your reply!
I used the ckpt_elm_drivelm.pth model on a single A100 GPU with 40GB of memory, but encountered an out-of-memory (OOM) error when loading the checkpoint. What GPU specifications (e.g., type and VRAM) are required for inference?