
Error reproducing SimulST evaluation

Open anshulwadhawan opened this issue 3 years ago • 0 comments

I am trying to reproduce the SimulST results on MuST-C by following the "Inference and Evaluation" section of fairseq/examples/speech_to_text/docs/simulst_mustc_example.md, using the provided databin directory and pretrained model checkpoint. Running the following simuleval command produces the error shown below:

simuleval \
    --agent ./fairseq/examples/speech_to_text/simultaneous_translation/agents/fairseq_simul_st_agent.py \
    --source ./data/en/input.txt \
    --target ./outputs/simulST/target.txt \
    --data-bin ./data/mustc/databin \
    --config config_st.yaml \
    --model-path ./simulSTmodel/convtransformer_wait5_pre7 \
    --output ./outputs/simulST \
    --scores

2022-10-29 12:59:27 | INFO     | simuleval.scorer | Evaluating on speech
2022-10-29 12:59:27 | INFO     | simuleval.scorer | Source: /nlp/data/awadhawan/s2st/data/en/input.txt
2022-10-29 12:59:27 | INFO     | simuleval.scorer | Target: /nlp/data/awadhawan/s2st/outputs/simulST/target.txt
2022-10-29 12:59:27 | INFO     | simuleval.scorer | Number of sentences: 2
2022-10-29 12:59:27 | INFO     | simuleval.server | Evaluation Server Started (process id 6607). Listening to port 12321
2022-10-29 12:59:30 | WARNING  | simuleval.scorer | Resetting scorer
2022-10-29 12:59:30 | INFO     | simuleval.cli    | Output dir: /nlp/data/awadhawan/s2st/outputs/simulST
2022-10-29 12:59:30 | INFO     | simuleval.cli    | Evaluating FairseqSimulSTAgent (process id 6604) on instances from 0 to 1
2022-10-29 12:59:30 | INFO     | simuleval.cli    | Start data writer (process id 6614)
2022-10-29 12:59:37 | INFO     | fairseq.tasks.speech_to_text | dictionary size (spm_unigram10000_st.txt): 10,000
Traceback (most recent call last):
  File "/home1/a/anshulw/miniconda3/envs/cenv/bin/simuleval", line 33, in <module>
    sys.exit(load_entry_point('simuleval', 'console_scripts', 'simuleval')())
  File "/mnt/nlpgridio3/data/awadhawan/s2st/SimulEval/simuleval/cli.py", line 165, in main
    _main(args.client_only)
  File "/mnt/nlpgridio3/data/awadhawan/s2st/SimulEval/simuleval/cli.py", line 192, in _main
    evaluate(args, client, server_process)
  File "/mnt/nlpgridio3/data/awadhawan/s2st/SimulEval/simuleval/cli.py", line 145, in evaluate
    decode(args, client, result_queue, indices)
  File "/mnt/nlpgridio3/data/awadhawan/s2st/SimulEval/simuleval/cli.py", line 102, in decode
    agent = agent_cls(args)
  File "/nlp/data/awadhawan/s2st/fairseq/examples/speech_to_text/simultaneous_translation/agents/fairseq_simul_st_agent.py", line 128, in __init__
    self.load_model_vocab(args)
  File "/nlp/data/awadhawan/s2st/fairseq/examples/speech_to_text/simultaneous_translation/agents/fairseq_simul_st_agent.py", line 225, in load_model_vocab
    self.model = task.build_model(state["cfg"]["model"])
  File "/mnt/nlpgridio3/data/awadhawan/s2st/fairseq/fairseq/tasks/speech_to_text.py", line 110, in build_model
    return super(SpeechToTextTask, self).build_model(args)
  File "/mnt/nlpgridio3/data/awadhawan/s2st/fairseq/fairseq/tasks/fairseq_task.py", line 651, in build_model
    model = models.build_model(args, self)
  File "/mnt/nlpgridio3/data/awadhawan/s2st/fairseq/fairseq/models/__init__.py", line 99, in build_model
    assert model is not None, (
AssertionError: Could not infer model type from Namespace(no_progress_bar=False, log_interval=100, log_format='simple', tensorboard_logdir='/checkpoint/xutaima/iwslt2021/offline_convtransformer_st/en-de-st/mustc_v1.0/en-de/tensorboard/sweep274-v5e573ed9b.convtransformer_simul_trans_espnet.wait5.fixed_pre7.lr0.0005.label_smoothed_cross_entropy.pretrain261.adam.ls0.1.maxtok40000.seed2.config_gcmvn_specaug.ngpu8', wandb_project=None, azureml_logging=False, seed=2, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, min_loss_scale=0.0001, threshold_loss_scale=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, criterion='label_smoothed_cross_entropy', tokenizer=None, bpe='sentencepiece', simul_type='waitk_fixed_pre_decision', optimizer='adam', lr_scheduler='inverse_sqrt', scoring='bleu', task='speech_to_text', num_workers=8, skip_invalid_size_inputs_valid_test=True, max_tokens=40000, batch_size=None, required_batch_size_multiple=8, required_seq_len_multiple=1, dataset_impl=None, data_buffer_size=10, train_subset='train', valid_subset='dev', validate_interval=1, validate_interval_updates=0, validate_after_updates=0, fixed_validation_seed=None, disable_validation=False, max_tokens_valid=40000, batch_size_valid=None, curriculum=0, gen_subset='test', num_shards=1, shard_id=0, distributed_world_size=8, distributed_rank=0, distributed_backend='nccl', distributed_init_method=None, distributed_port=-1, device_id=0, distributed_no_spawn=False, ddp_backend='no_c10d', bucket_cap_mb=25, fix_batches_to_gpus=False, find_unused_parameters=False, fast_stat_sync=False, heartbeat_timeout=-1, broadcast_buffers=False, slowmo_momentum=None, slowmo_algorithm='LocalSGD', localsgd_frequency=3, nprocs_per_node=8, pipeline_model_parallel=False, pipeline_balance=None, pipeline_devices=None, pipeline_chunks=0, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_checkpoint='never', zero_sharding='none', arch='convtransformer_simul_trans_espnet', max_epoch=0, max_update=30000, stop_time_hours=0, clip_norm=10.0, sentence_avg=False, update_freq=[1], lr=[0.0005], stop_min_lr=-1.0, use_bmuf=False, save_dir='/checkpoint/xutaima/iwslt2021/offline_convtransformer_st/en-de-st/mustc_v1.0/en-de/checkpoints/sweep274-v5e573ed9b.convtransformer_simul_trans_espnet.wait5.fixed_pre7.lr0.0005.label_smoothed_cross_entropy.pretrain261.adam.ls0.1.maxtok40000.seed2.config_gcmvn_specaug.ngpu8', restore_file='checkpoint_last.pt', finetune_from_model=None, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, optimizer_overrides='{}', save_interval=1, save_interval_updates=0, keep_interval_updates=-1, keep_last_epochs=10, keep_best_checkpoints=-1, no_save=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_save_optimizer_state=False, best_checkpoint_metric='loss', maximize_best_checkpoint_metric=False, patience=-1, checkpoint_suffix='', checkpoint_shard_count=1, load_checkpoint_on_all_dp_ranks=False, train_monotonic_only=False, data='/private/home/xutaima/data/ast/must_c_1_0/en-de/manifests/st_base', config_yaml='config_gvmcn_specaug.yaml', max_source_positions=6000, max_target_positions=1024, label_smoothing=0.1, report_accuracy=True, 
ignore_prefix_size=0, sentencepiece_model='???', mass_preservation=True, noise_var=1.0, noise_mean=0.0, noise_type='flat', energy_bias=False, energy_bias_init=-2.0, attention_eps=1e-06, waitk_lagging=5, fixed_pre_decision_ratio=7, fixed_pre_decision_type='average', fixed_pre_decision_pad_threshold=0.3, adam_betas='(0.9, 0.999)', adam_eps=1e-08, weight_decay=0.0, use_old_adam=False, warmup_updates=10000, warmup_init_lr=-1, pad=1, eos=2, unk=3, load_pretrained_encoder_from=None, no_seed_provided=False, encoder_embed_dim=256, encoder_layers=12, encoder_attention_heads=4, decoder_attention_heads=4, _name='convtransformer_simul_trans_espnet', input_feat_per_channel=80, input_channels=1, encoder_ffn_embed_dim=2048, encoder_normalize_before=False, decoder_embed_dim=256, decoder_ffn_embed_dim=2048, decoder_layers=6, decoder_normalize_before=False, decoder_learned_pos=False, attention_dropout=0.0, activation_dropout=0.0, activation_fn='relu', dropout=0.1, adaptive_softmax_cutoff=None, adaptive_softmax_dropout=0, share_decoder_input_output_embed=False, no_token_positional_embeddings=False, adaptive_input=False, decoder_layerdrop=0.0, decoder_output_dim=256, decoder_input_dim=256, no_scale_embedding=False, quant_noise_pq=0, tie_adaptive_weights=False, conv_out_channels=256, load_pretrained_decoder_from=None). Available models: dict_keys(['wav2vec', 'wav2vec2', 'wav2vec_ctc', 'wav2vec_seq2seq', 'hubert', 'hubert_ctc', 'transformer_lm']) Requested model type: convtransformer_simul_trans_espnet
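
For reference, the checkpoint can be inspected the same way the agent does it in load_model_vocab, to confirm which architecture it asks fairseq to build. This is just a minimal diagnostic sketch, not code from the repo; the path is the --model-path value from my command above:

# Diagnostic sketch: load the checkpoint the way the agent does and print the
# architecture it requests.
from fairseq import checkpoint_utils

state = checkpoint_utils.load_checkpoint_to_cpu(
    "./simulSTmodel/convtransformer_wait5_pre7"  # --model-path from the command
)
model_cfg = state["cfg"]["model"]  # same access pattern as in fairseq_simul_st_agent.py
print(model_cfg.arch)  # per the error above: convtransformer_simul_trans_espnet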

@xutaima Could you please help me figure out what is going wrong here?
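
In case it helps with debugging, here is a sketch of how the architectures registered in the running environment can be listed (assuming a standard fairseq install; ARCH_MODEL_REGISTRY and MODEL_REGISTRY are the registries fairseq's build_model consults). If convtransformer_simul_trans_espnet is missing from them, the assertion above would fire exactly as shown:

# Diagnostic sketch: list what this fairseq install has registered.
# The assertion in fairseq/models/__init__.py fires when the requested
# architecture is absent from these registries.
from fairseq.models import ARCH_MODEL_REGISTRY, MODEL_REGISTRY

print("convtransformer_simul_trans_espnet" in ARCH_MODEL_REGISTRY)
print(sorted(MODEL_REGISTRY.keys()))  # the same list the error message prints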

anshulwadhawan · Oct 29 '22, 17:10