NeRFusion failed to run inference with pretrained weights

Hi,

Thanks for sharing the code. I am trying to follow the instructions for running inference with the pre-trained network, but the command below fails with the following error. It looks like the G.ckpt you shared on Google Drive is invalid. Can you help?

Thank you.

python train.py --dataset_name scannet --root_dir /data/nerf/nerfusion/scene0000_01 --exp_name try_pretrain_scannnet --ckpt_path /data/nerf/nerfusion/G.ckpt

Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Missing logger folder: logs/scannet/try_pretrain_scannnet
Loading 800 train images ...
100%|████████████████████████████████████████████████| 800/800 [00:07<00:00, 103.75it/s]
Loading 80 test images ...
100%|██████████████████████████████████████████████████| 80/80 [00:00<00:00, 108.76it/s]
Restoring states from the checkpoint path at /data/nerf/nerfusion/G.ckpt
Traceback (most recent call last):
  File "train.py", line 259, in <module>
    trainer.fit(system, ckpt_path=hparams.ckpt_path)
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
    self._call_and_handle_interrupt(
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1110, in _run
    self._restore_modules_and_callbacks(ckpt_path)
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1063, in _restore_modules_and_callbacks
    self._checkpoint_connector.resume_start(checkpoint_path)
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 85, in resume_start
    self._loaded_checkpoint = self._load_and_validate_checkpoint(checkpoint_path)
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 89, in _load_and_validate_checkpoint
    loaded_checkpoint = self.trainer.strategy.load_checkpoint(checkpoint_path)
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 339, in load_checkpoint
    return self.checkpoint_io.load_checkpoint(checkpoint_path)
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/plugins/io/torch_plugin.py", line 85, in load_checkpoint
    return pl_load(path, map_location=map_location)
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/utilities/cloud_io.py", line 47, in load
    return torch.load(f, map_location=map_location)
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/torch/serialization.py", line 705, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/home/songlin/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/torch/serialization.py", line 242, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

Oct 19 '22 02:10 songlin

I had the same problem

Oct 25 '22 02:10 jiangxf0929

Hi @songlin @jiangxf0929, I have run into the same issue. Have you managed to solve it? Thanks.

Oct 30 '22 15:10 zhao-yiqun

Hi @songlin @jiangxf0929 @zhao-yiqun, I have run into the same issue. Have you managed to solve it? Thanks!

Nov 19 '22 14:11 Riser6

Hi @songlin @jiangxf0929 @zhao-yiqun @Riser6, I've run into the same issue. Have you solved it? Thanks.

Dec 21 '22 12:12 Ailon-Island

I had the same problem. I think the provided model checkpoint is corrupted. Even a bare torch.load(f, map_location=map_location) call cannot read anything from the file.
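
For anyone who wants to check their own copy before re-downloading: checkpoints written by recent versions of torch.save are zip archives, and "failed finding central directory" is consistent with the end of that archive being missing, for example after an interrupted download. The sketch below inspects the file without loading it. The helper name inspect_checkpoint is made up for illustration, and the HTML check is only a guess at one possible cause (a web page saved in place of the file), not something confirmed in this thread.

```python
import os
import zipfile


def inspect_checkpoint(path):
    """Return a short diagnosis of a checkpoint file without unpickling it."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(512)  # the first bytes are enough to recognize the file type

    if size == 0:
        return "empty file"
    # A browser or download tool may have saved an HTML page instead of the weights.
    if head.lstrip().lower().startswith((b"<!doctype html", b"<html")):
        return "HTML page, not a checkpoint"
    # is_zipfile looks for the end-of-central-directory record near the end of the file.
    if zipfile.is_zipfile(path):
        return "valid zip container"
    # Zip local-file header present, but no central directory: likely cut short.
    if head.startswith(b"PK"):
        return "truncated zip: header present, central directory missing"
    return "not a zip archive (legacy format or corrupted)"


if __name__ == "__main__":
    ckpt = "/data/nerf/nerfusion/G.ckpt"  # path from the original post; adjust to yours
    if os.path.exists(ckpt):
        print(ckpt, "->", inspect_checkpoint(ckpt), f"({os.path.getsize(ckpt)} bytes)")
```

If this reports a truncated zip, comparing the local file size with the size shown on the download page would help tell an incomplete download apart from a broken upload.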

Dec 22 '22 13:12 Karbo123

Can you tell me which versions of torch and CUDA you are using? I have been trying to set up the NeRFusion environment, but problems keep appearing: each time I fix one, a different one shows up, and the setup still fails. Did your environment configuration go smoothly? Do you have any advice? Looking forward to your reply.
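
If it helps anyone replying, something like the following collects the versions in question so they can be pasted into the thread. It is only a sketch: describe_environment is a made-up helper name, and it reports None for torch when the package cannot be imported in the active environment.

```python
import platform


def describe_environment():
    """Collect the Python, torch, and CUDA versions visible to this interpreter."""
    info = {"python": platform.python_version(), "torch": None}
    try:
        import torch
    except ImportError:
        return info  # torch is not installed in this environment

    info["torch"] = torch.__version__
    # CUDA version torch was built against (None for CPU-only builds).
    info["cuda_build"] = torch.version.cuda
    # Whether a CUDA device is actually usable at runtime.
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```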

Mar 02 '23 03:03 Bin-ze