D3Net

Dimension mismatch while loading model from checkpoint

Open Rajrup opened this issue 2 years ago • 4 comments

Thanks for sharing this great work!

I am currently hitting an issue while running the evaluation for the PointGroup detector using the checkpoint file you shared:

    python scripts/eval.py --folder <output_folder> --task detection

Output:

    Traceback (most recent call last):
      File "scripts/eval.py", line 522, in <module>
        model = init_model(cfg, dataset)
      File "scripts/eval.py", line 121, in init_model
        model.load_state_dict(checkpoint["state_dict"], strict=False)
      File "/home/rajrup/miniconda3/envs/d3net-original/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1406, in load_state_dict
        raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
    RuntimeError: Error(s) in loading state_dict for PipelineNet:
        size mismatch for embeddings: copying a param with shape torch.Size([3441, 300]) from checkpoint, the shape in current model is torch.Size([3535, 300]).
        size mismatch for speaker.caption.embeddings: copying a param with shape torch.Size([3441, 300]) from checkpoint, the shape in current model is torch.Size([3535, 300]).
        size mismatch for speaker.caption.classifier.2.weight: copying a param with shape torch.Size([3441, 512]) from checkpoint, the shape in current model is torch.Size([3535, 512]).
        size mismatch for speaker.caption.classifier.2.bias: copying a param with shape torch.Size([3441]) from checkpoint, the shape in current model is torch.Size([3535]).

The tensor dimensions in the checkpoint don't match the ones the code expects. Before the model-loading step, the val split and the vocabulary load fine. I might be missing something here. Can you please help me solve this issue?
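
To see exactly which parameters disagree before load_state_dict raises, the shapes on both sides can be compared directly. This is only a minimal sketch, assuming the shapes have already been collected into plain dicts; find_shape_mismatches is a hypothetical helper, not part of D3Net, and the example shapes are copied from the traceback above.

```python
def find_shape_mismatches(ckpt_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for shared names whose shapes differ.

    Hypothetical helper for illustration; both arguments map parameter
    names to shape tuples.
    """
    mismatches = {}
    for name, ckpt_shape in ckpt_shapes.items():
        model_shape = model_shapes.get(name)
        # Only compare parameters that exist on both sides.
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Two of the four shapes reported in the traceback.
ckpt_shapes = {
    "embeddings": (3441, 300),
    "speaker.caption.classifier.2.bias": (3441,),
}
model_shapes = {
    "embeddings": (3535, 300),
    "speaker.caption.classifier.2.bias": (3535,),
}

for name, (ckpt, model) in find_shape_mismatches(ckpt_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

With PyTorch, the two dicts could be built as {k: tuple(v.shape) for k, v in checkpoint["state_dict"].items()} and the same expression over model.state_dict(). In the traceback, the only dimension that differs is the first one (3441 versus 3535), which suggests the locally built vocabulary is not the same size as the one used for training.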

Thanks!

Rajrup, Feb 22 '23 08:02

I hit a very similar error while running the evaluation for PointGroup captioning.

Here is the error:

    Traceback (most recent call last):
      File "scripts/eval.py", line 523, in <module>
        model = init_model(cfg, dataset)
      File "scripts/eval.py", line 122, in init_model
        model.load_state_dict(checkpoint["state_dict"], strict=False)
      File "/home/niexing/anaconda3/envs/D3Net/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for PipelineNet:
        size mismatch for embeddings: copying a param with shape torch.Size([3441, 300]) from checkpoint, the shape in current model is torch.Size([3433, 300]).
        size mismatch for speaker.caption.embeddings: copying a param with shape torch.Size([3441, 300]) from checkpoint, the shape in current model is torch.Size([3433, 300]).
        size mismatch for speaker.caption.classifier.2.weight: copying a param with shape torch.Size([3441, 512]) from checkpoint, the shape in current model is torch.Size([3433, 512]).
        size mismatch for speaker.caption.classifier.2.bias: copying a param with shape torch.Size([3441]) from checkpoint, the shape in current model is torch.Size([3433]).

STAR-ALG, Apr 25 '23 13:04

Delete [:self.max_des_len] here.

https://github.com/daveredrum/D3Net/blob/b505e984cc4b01ea6ed95aa94b7bafa45215f4f4/lib/dataset/pipeline.py#L453
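
One plausible reason the slice matters, assuming it truncates each description's tokens before the vocabulary is built (the linked line is not reproduced here, so this is an assumption): words that occur only past the cutoff never enter the vocabulary, so it ends up smaller than the one stored in the checkpoint. A toy sketch with hypothetical names and made-up sentences, not the D3Net code:

```python
def build_vocab(descriptions, max_des_len=None):
    """Collect the unique tokens, optionally truncating each description first.

    Hypothetical stand-in for a vocabulary-building step; max_des_len
    mirrors the slice mentioned above.
    """
    vocab = set()
    for tokens in descriptions:
        kept = tokens if max_des_len is None else tokens[:max_des_len]
        vocab.update(kept)
    return sorted(vocab)


# Made-up descriptions, already tokenized on whitespace.
descriptions = [
    "the brown chair is next to the window".split(),
    "a white desk stands beside the tall bookshelf".split(),
]

full = build_vocab(descriptions)
truncated = build_vocab(descriptions, max_des_len=4)

# Truncating before building the vocabulary drops the words past the cutoff.
print(len(full), len(truncated))
```

This would fit the 3433-versus-3441 case reported above. It would not by itself explain a local vocabulary that is larger than the checkpoint's, such as 3535.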

CurryYuan, Apr 28 '23 04:04

@CurryYuan Thank you very much for your help! I have changed the code as you suggested. But when I run the evaluation, I hit a new issue:

    python scripts/eval.py --folder <output_folder> --task detection

Output:

    Could not import cythonized box intersection. Consider compiling box_intersection.pyx for faster training.
    => loading configurations...
    => initializing data...
    => loading train split...
    100%|██████████| 562/562 [00:34<00:00, 16.37it/s]
    building vocabulary...
    Traceback (most recent call last):
      File "scripts/eval.py", line 519, in <module>
        dataset, dataloader = init_data(cfg)
      File "scripts/eval.py", line 82, in init_data
        cap_train_dataset = Dataset(cfg, cfg.general.dataset, mode, "train", raw_train, raw_train_scan_list, SCAN2CAD)
      File "./lib/dataset/pipeline.py", line 61, in __init__
        self._load()
      File "./lib/dataset/pipeline.py", line 392, in _load
        self.lang, self.lang_ids = self._tranform_des(self.max_des_len)
      File "./lib/dataset/pipeline.py", line 550, in _tranform_des
        embeddings[token_id] = self.glove[glove_id]
    IndexError: index 3433 is out of bounds for axis 0 with size 3433

I would really appreciate it if you could help me. Thank you very much!

STAR-ALG, Apr 28 '23 09:04

@daveredrum any suggestions would be helpful.

Rajrup, Sep 14 '23 18:09