Dimension mismatch while loading model from checkpoint
Open · Rajrup opened this issue 2 years ago · 4 comments
Thanks for sharing this great work!
I am currently hitting an issue while running the evaluation for the pointgroup detector using the checkpoint file you shared.
python scripts/eval.py --folder <output_folder> --task detection
Output:
Traceback (most recent call last):
File "scripts/eval.py", line 522, in <module>
model = init_model(cfg, dataset)
File "scripts/eval.py", line 121, in init_model
model.load_state_dict(checkpoint["state_dict"], strict=False)
File "/home/rajrup/miniconda3/envs/d3net-original/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1406, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PipelineNet:
size mismatch for embeddings: copying a param with shape torch.Size([3441, 300]) from checkpoint, the shape in current model is torch.Size([3535, 300]).
size mismatch for speaker.caption.embeddings: copying a param with shape torch.Size([3441, 300]) from checkpoint, the shape in current model is torch.Size([3535, 300]).
size mismatch for speaker.caption.classifier.2.weight: copying a param with shape torch.Size([3441, 512]) from checkpoint, the shape in current model is torch.Size([3535, 512]).
size mismatch for speaker.caption.classifier.2.bias: copying a param with shape torch.Size([3441]) from checkpoint, the shape in current model is torch.Size([3535]).
The dimensions of the tensors in the checkpoint don't match the ones the code expects. Before the model load step, the val splits and the vocabulary load fine. I might be missing something here. Can you please help me solve this issue?
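For anyone debugging the same thing: as far as I know, `strict=False` only relaxes missing and unexpected keys, so tensors whose shapes differ still raise. Below is a minimal sketch for listing the mismatched parameters before calling `load_state_dict`. The helper name `find_shape_mismatches` is hypothetical and not part of this repository; the shapes are copied from the traceback above.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    ckpt_shapes: Dict[str, Shape], model_shapes: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for shared keys whose shapes differ."""
    shared = sorted(ckpt_shapes.keys() & model_shapes.keys())
    return [
        (name, ckpt_shapes[name], model_shapes[name])
        for name in shared
        if ckpt_shapes[name] != model_shapes[name]
    ]


# With PyTorch, the two mappings could be built like this (not executed here):
#   ckpt_shapes = {k: tuple(v.shape) for k, v in checkpoint["state_dict"].items()}
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}

# Shapes copied from the traceback in this comment.
ckpt_shapes = {
    "embeddings": (3441, 300),
    "speaker.caption.embeddings": (3441, 300),
    "speaker.caption.classifier.2.weight": (3441, 512),
    "speaker.caption.classifier.2.bias": (3441,),
}
model_shapes = {
    "embeddings": (3535, 300),
    "speaker.caption.embeddings": (3535, 300),
    "speaker.caption.classifier.2.weight": (3535, 512),
    "speaker.caption.classifier.2.bias": (3535,),
}

mismatches = find_shape_mismatches(ckpt_shapes, model_shapes)
for name, ckpt, current in mismatches:
    print(f"{name}: checkpoint {ckpt} vs model {current}")
```

In all four tensors only the first dimension differs (3441 vs 3535), which suggests a single size, possibly the vocabulary size, is computed differently locally than it was when the checkpoint was trained.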
I hit a very similar error while running the evaluation for the pointgroup captioning.
Here is the error:
Traceback (most recent call last):
File "scripts/eval.py", line 523, in <module>
model = init_model(cfg, dataset)
File "scripts/eval.py", line 122, in init_model
model.load_state_dict(checkpoint["state_dict"], strict=False)
File "/home/niexing/anaconda3/envs/D3Net/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PipelineNet:
size mismatch for embeddings: copying a param with shape torch.Size([3441, 300]) from checkpoint, the shape in current model is torch.Size([3433, 300]).
size mismatch for speaker.caption.embeddings: copying a param with shape torch.Size([3441, 300]) from checkpoint, the shape in current model is torch.Size([3433, 300]).
size mismatch for speaker.caption.classifier.2.weight: copying a param with shape torch.Size([3441, 512]) from checkpoint, the shape in current model is torch.Size([3433, 512]).
size mismatch for speaker.caption.classifier.2.bias: copying a param with shape torch.Size([3441]) from checkpoint, the shape in current model is torch.Size([3433]).
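The two reports disagree with the checkpoint in opposite directions. One possible reading, not confirmed anywhere in this thread, is that the mismatched dimension is the vocabulary size and that each local setup builds a slightly different vocabulary. Here is a rough sketch for quantifying the gap and, if the training-time token list is available, seeing which tokens differ. `diff_vocab` is a hypothetical helper, and the token lists are made-up toy data used only to show the call.

```python
from typing import Iterable, List, Tuple


def diff_vocab(
    reference: Iterable[str], local: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (tokens only in reference, tokens only in local), each sorted."""
    ref_set, local_set = set(reference), set(local)
    return sorted(ref_set - local_set), sorted(local_set - ref_set)


# Sizes reported in the two tracebacks in this thread.
CHECKPOINT_SIZE = 3441
size_deltas = {
    "first report": 3535 - CHECKPOINT_SIZE,
    "second report": 3433 - CHECKPOINT_SIZE,
}

# Toy token lists, invented for illustration; real lists would come from
# the vocabulary used at training time and the one built locally.
missing, extra = diff_vocab(["chair", "table", "sofa"], ["chair", "table", "lamp"])

print(size_deltas)
print("only in reference:", missing)
print("only in local:", extra)
```

So the first setup ends up with 94 more entries than the checkpoint and the second with 8 fewer.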
@CurryYuan Thank you very much for your help! I have fixed the code as you suggested. But when I run the evaluation with python scripts/eval.py --folder <output_folder> --task detection, I hit a new issue:
Output:
Could not import cythonized box intersection. Consider compiling box_intersection.pyx for faster training.
=> loading configurations...
=> initializing data...
=> loading train split...
100%|██████████| 562/562 [00:34<00:00, 16.37it/s]
building vocabulary...
Traceback (most recent call last):
File "scripts/eval.py", line 519, in <module>
dataset, dataloader = init_data(cfg)
File "scripts/eval.py", line 82, in init_data
cap_train_dataset = Dataset(cfg, cfg.general.dataset, mode, "train", raw_train, raw_train_scan_list, SCAN2CAD)
File "./lib/dataset/pipeline.py", line 61, in __init__
self._load()
File "./lib/dataset/pipeline.py", line 392, in _load
self.lang, self.lang_ids = self._tranform_des(self.max_des_len)
File "./lib/dataset/pipeline.py", line 550, in _tranform_des
embeddings[token_id] = self.glove[glove_id]
IndexError: index 3433 is out of bounds for axis 0 with size 3433
I would really appreciate it if you could help me. Thank you very much!
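A note on reading that IndexError: an array with 3433 rows has valid indices 0 to 3432, so index 3433 is exactly one past the end. The traceback does not say whether `embeddings` or `self.glove` is the array being overrun. A small sketch of a check that could be run on both sets of ids before the assignment; `out_of_range_ids` is a hypothetical helper, not something from the repository.

```python
from typing import Iterable, List


def out_of_range_ids(ids: Iterable[int], size: int) -> List[int]:
    """Return the ids that are not valid row indices for a table with `size` rows."""
    return [i for i in ids if not 0 <= i < size]


# Size taken from the error message above; the ids are illustrative.
TABLE_ROWS = 3433
bad = out_of_range_ids([0, 3432, 3433], TABLE_ROWS)
print(bad)  # [3433]
```

Running it once with the token ids against the number of rows in `embeddings`, and once with the GloVe ids against the number of rows in `self.glove`, should show which side is off by one.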