nemo2riva "TypeError: Can't instantiate abstract class ModelPT with abstract methods list_available_models, setup_training_data, setup_validation_data"

Open fujiawei0724 opened this issue 11 months ago • 2 comments

When I try to convert a model trained with NeMo to Riva format, I encounter this error:

nemo2riva --out out.riva --key=nemotoriva /home/student/test_nemo/checkpoints/Conformer-CTC-BPE/test/checkpoints/Conformer-CTC-BPE--val_wer=1.3506-epoch=1-last.ckpt
INFO: PyTorch version 2.6.0 available.
[NeMo W 2025-03-24 13:23:38 classification_models:490] Please use the EncDecSpeakerLabelModel instead of this model. EncDecClassificationModel model is kept for backward compatibility with older models.
[NeMo I 2025-03-24 13:23:38 nemo2riva:38] Logging level set to 20
[NeMo I 2025-03-24 13:23:38 convert:36] Restoring NeMo model from '/home/student/test_nemo/checkpoints/Conformer-CTC-BPE/test/checkpoints/Conformer-CTC-BPE--val_wer=1.3506-epoch=1-last.ckpt'
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Trainer(limit_train_batches=1.0) was configured so 100% of the batches per epoch will be used..
Trainer(limit_val_batches=1.0) was configured so 100% of the batches will be used..
Trainer(limit_test_batches=1.0) was configured so 100% of the batches will be used..
Trainer(limit_predict_batches=1.0) was configured so 100% of the batches will be used..
Trainer(val_check_interval=1.0) was configured so validation will run at the end of the training epoch..
[NeMo E 2025-03-24 13:23:38 convert:54] Failed to restore model from NeMo file : /home/student/test_nemo/checkpoints/Conformer-CTC-BPE/test/checkpoints/Conformer-CTC-BPE--val_wer=1.3506-epoch=1-last.ckpt. Please make sure you have the latest NeMo package installed with [all] dependencies.
Traceback (most recent call last):
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/tarfile.py", line 1906, in gzopen
    t = cls.taropen(name, mode, fileobj, **kwargs)
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/tarfile.py", line 1883, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/tarfile.py", line 1743, in __init__
    self.firstmember = self.next()
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/tarfile.py", line 2658, in next
    raise e
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/tarfile.py", line 2631, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/tarfile.py", line 1295, in fromtarfile
    buf = tarfile.fileobj.read(BLOCKSIZE)
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/gzip.py", line 301, in read
    return self._buffer.read(size)
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/gzip.py", line 488, in read
    if not self._read_gzip_header():
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/gzip.py", line 436, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'PK')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/bin/nemo2riva", line 8, in <module>
    sys.exit(nemo2riva())
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/site-packages/nemo2riva/cli/nemo2riva.py", line 49, in nemo2riva
    Nemo2Riva(args)
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/site-packages/nemo2riva/convert.py", line 59, in Nemo2Riva
    raise e
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/site-packages/nemo2riva/convert.py", line 52, in Nemo2Riva
    model = ModelPT.restore_from(restore_path=nemo_in, trainer=trainer)
  File "/home/student/test_nemo/NeMo/nemo/core/classes/modelPT.py", line 482, in restore_from
    instance = cls._save_restore_connector.restore_from(
  File "/home/student/test_nemo/NeMo/nemo/core/connectors/save_restore_connector.py", line 260, in restore_from
    loaded_params = self.load_config_and_state_dict(
  File "/home/student/test_nemo/NeMo/nemo/core/connectors/save_restore_connector.py", line 148, in load_config_and_state_dict
    members = self._filtered_tar_info(restore_path, filter_fn=filter_fn)
  File "/home/student/test_nemo/NeMo/nemo/core/connectors/save_restore_connector.py", line 627, in _filtered_tar_info
    with SaveRestoreConnector._tar_open(tar_path) as tar:
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/home/student/test_nemo/NeMo/nemo/core/connectors/save_restore_connector.py", line 666, in _tar_open
    tar = tarfile.open(path2file, tar_header)
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/tarfile.py", line 1853, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/tarfile.py", line 1910, in gzopen
    raise ReadError("not a gzip file") from e
tarfile.ReadError: not a gzip file

(nemo-fine-tune-2) student@anran:~/test_nemo$ nemo2riva --out out.riva --key=nemotoriva /home/student/test_nemo/checkpoints/Conformer-CTC-BPE/test/checkpoints/Conformer-CTC-BPE.nemo
INFO: PyTorch version 2.6.0 available.
[NeMo W 2025-03-24 13:23:47 classification_models:490] Please use the EncDecSpeakerLabelModel instead of this model. EncDecClassificationModel model is kept for backward compatibility with older models.
[NeMo I 2025-03-24 13:23:47 nemo2riva:38] Logging level set to 20
[NeMo I 2025-03-24 13:23:47 convert:36] Restoring NeMo model from '/home/student/test_nemo/checkpoints/Conformer-CTC-BPE/test/checkpoints/Conformer-CTC-BPE.nemo'
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Trainer(limit_train_batches=1.0) was configured so 100% of the batches per epoch will be used..
Trainer(limit_val_batches=1.0) was configured so 100% of the batches will be used..
Trainer(limit_test_batches=1.0) was configured so 100% of the batches will be used..
Trainer(limit_predict_batches=1.0) was configured so 100% of the batches will be used..
Trainer(val_check_interval=1.0) was configured so validation will run at the end of the training epoch..
[NeMo I 2025-03-24 13:23:47 mixins:181] Tokenizer SentencePieceTokenizer initialized with 128 tokens
[NeMo E 2025-03-24 13:23:47 common:529] Model instantiation failed!
Target class: nemo.collections.asr.models.ctc_bpe_models.EncDecCTCModelBPE
Error(s): trainer constructor argument must be either None or lightning.pytorch.Trainer. But got <class 'pytorch_lightning.trainer.trainer.Trainer'> instead.
Traceback (most recent call last):
  File "/home/student/test_nemo/NeMo/nemo/core/classes/common.py", line 508, in from_config_dict
    instance = imported_cls(cfg=config, trainer=trainer)
  File "/home/student/test_nemo/NeMo/nemo/collections/asr/models/ctc_bpe_models.py", line 75, in __init__
    super().__init__(cfg=cfg, trainer=trainer)
  File "/home/student/test_nemo/NeMo/nemo/collections/asr/models/ctc_models.py", line 59, in __init__
    super().__init__(cfg=cfg, trainer=trainer)
  File "/home/student/test_nemo/NeMo/nemo/core/classes/modelPT.py", line 84, in __init__
    raise ValueError(
ValueError: trainer constructor argument must be either None or lightning.pytorch.Trainer. But got <class 'pytorch_lightning.trainer.trainer.Trainer'> instead.

[NeMo E 2025-03-24 13:23:47 convert:54] Failed to restore model from NeMo file : /home/student/test_nemo/checkpoints/Conformer-CTC-BPE/test/checkpoints/Conformer-CTC-BPE.nemo. Please make sure you have the latest NeMo package installed with [all] dependencies.
Traceback (most recent call last):
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/bin/nemo2riva", line 8, in <module>
    sys.exit(nemo2riva())
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/site-packages/nemo2riva/cli/nemo2riva.py", line 49, in nemo2riva
    Nemo2Riva(args)
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/site-packages/nemo2riva/convert.py", line 59, in Nemo2Riva
    raise e
  File "/home/student/anaconda3/envs/nemo-fine-tune-2/lib/python3.10/site-packages/nemo2riva/convert.py", line 52, in Nemo2Riva
    model = ModelPT.restore_from(restore_path=nemo_in, trainer=trainer)
  File "/home/student/test_nemo/NeMo/nemo/core/classes/modelPT.py", line 482, in restore_from
    instance = cls._save_restore_connector.restore_from(
  File "/home/student/test_nemo/NeMo/nemo/core/connectors/save_restore_connector.py", line 260, in restore_from
    loaded_params = self.load_config_and_state_dict(
  File "/home/student/test_nemo/NeMo/nemo/core/connectors/save_restore_connector.py", line 182, in load_config_and_state_dict
    instance = calling_cls.from_config_dict(config=conf, trainer=trainer)
  File "/home/student/test_nemo/NeMo/nemo/core/classes/common.py", line 530, in from_config_dict
    raise e
  File "/home/student/test_nemo/NeMo/nemo/core/classes/common.py", line 522, in from_config_dict
    instance = cls(cfg=config, trainer=trainer)
TypeError: Can't instantiate abstract class ModelPT with abstract methods list_available_models, setup_training_data, setup_validation_data
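
Reading the two tracebacks for the .nemo run together: the target class fails first with the ValueError about the trainer type, and the TypeError in the title is raised afterwards, when from_config_dict retries with cls, which here is the abstract ModelPT. The final TypeError therefore looks like a symptom rather than the root cause. The sketch below reproduces that pattern with the standard library only. Every name in it (BaseModel, ConcreteModel, ExpectedTrainer, OtherTrainer, from_config) is hypothetical and only mirrors the shape of the traceback; it is not NeMo's code.

```python
from abc import ABC, abstractmethod


class ExpectedTrainer:
    """Hypothetical stand-in for the trainer class the model accepts."""


class OtherTrainer:
    """Hypothetical stand-in for a trainer class from a different package."""


class BaseModel(ABC):
    def __init__(self, trainer=None):
        # Reject trainers of the wrong type, as the ValueError in the log does.
        if trainer is not None and not isinstance(trainer, ExpectedTrainer):
            raise ValueError(
                f"trainer must be None or ExpectedTrainer, got {type(trainer)}"
            )
        self.trainer = trainer

    @abstractmethod
    def setup_training_data(self):
        """Concrete models must implement this."""

    @classmethod
    def from_config(cls, target_cls, trainer=None):
        try:
            # First attempt: build the concrete class named in the config.
            return target_cls(trainer=trainer)
        except Exception as first_error:
            print(f"Model instantiation failed: {first_error}")
            # Fallback: build `cls` itself. When `cls` is the abstract base,
            # this raises TypeError, which is the error the caller ends up seeing.
            return cls(trainer=trainer)


class ConcreteModel(BaseModel):
    def setup_training_data(self):
        return "ok"


if __name__ == "__main__":
    try:
        BaseModel.from_config(ConcreteModel, trainer=OtherTrainer())
    except TypeError as final_error:
        print(f"Final error seen by the caller: {final_error}")
```

Under this reading, the message to act on is the earlier "Model instantiation failed!" entry, not the abstract-class TypeError.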

Are there any suggestions for solving this error? Thanks.
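
A note on the first run (the one given the -last.ckpt file): the message "Not a gzipped file (b'PK')" indicates the file starts with the zip signature, while the traceback shows restore_from opening its input with tarfile. That suggests the .ckpt is a zip container and not a .nemo tar archive, so the loader cannot open it whatever the trainer fix. A minimal standard-library sketch for checking which container a file is, before passing it to nemo2riva; the function name sniff_container is made up for this example:

```python
import sys
import tarfile
import zipfile


def sniff_container(path):
    """Return 'tar', 'zip', or 'unknown' for the file at `path`."""
    path = str(path)
    # tarfile.is_tarfile also recognizes compressed tar archives (e.g. gzip).
    if tarfile.is_tarfile(path):
        return "tar"
    # Zip files start with the b'PK' signature seen in the error message.
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"


if __name__ == "__main__":
    for candidate in sys.argv[1:]:
        print(f"{candidate}: {sniff_container(candidate)}")
```

If a checkpoint reports zip here, the tar-based loader in the traceback will presumably reject it in the same way; the .nemo file from the same directory is the one that reaches the later trainer error.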

fujiawei0724 avatar Mar 24 '25 05:03 fujiawei0724

@fujiawei0724

I can successfully train with the NeMo NGC image nvcr.io/nvidia/nemo:24.12, following the tutorials,

then install this fixed version of nemo2riva: https://github.com/jmayank1511/nemo2riva/tree/trainer_fix
pip install -r requirements
python setup.py build
python setup.py install

and export with the command:
nemo2riva --out '/workspace/riva-tutorials/checkpoints/Conformer-CTC-BPE/test/checkpoints/Conformer-CTC-BPE.riva' --key=nemotoriva '/workspace/riva-tutorials/checkpoints/Conformer-CTC-BPE/test/checkpoints/Conformer-CTC-BPE.nemo' --max-dim 1000

Please give it a try.
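
Before and after switching nemo2riva builds, it may help to record which of the relevant packages are installed, since the ValueError in the log says the model expects lightning.pytorch.Trainer but received a pytorch_lightning Trainer. The following is a minimal sketch using only importlib.metadata. The distribution names in DEFAULT_PACKAGES are assumptions about how the packages are registered in a given environment (a source checkout of NeMo may report differently), and the helper name installed_versions is made up for this example.

```python
from importlib import metadata

# Assumed distribution names; adjust to match the environment.
DEFAULT_PACKAGES = (
    "nemo_toolkit",
    "nemo2riva",
    "lightning",
    "pytorch-lightning",
    "torch",
)


def installed_versions(packages=DEFAULT_PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Having both lightning and pytorch-lightning installed is not wrong in itself; the point is only to see which combination is present when the trainer type check fails.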

vincent-sluo avatar Apr 08 '25 13:04 vincent-sluo

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar May 09 '25 02:05 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar May 16 '25 02:05 github-actions[bot]