NeMo eval_beamsearch_ngram_ctc.py throws "got an unexpected keyword argument 'logprobs'"

Unable to use KenLM rescoring because transcribe() does not accept the logprobs argument.

Steps/Code to reproduce the bug

Cloned the repo at commit 7916269.
Used script scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py.

It throws the following error: TypeError: EncDecCTCModel.transcribe() got an unexpected keyword argument 'logprobs'.
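A TypeError like this means the installed transcribe() signature differs from the one the script was written against. As a minimal sketch (not NeMo code), a caller can inspect the signature before passing a keyword. The two transcribe_* functions below are hypothetical stand-ins for an older and a newer signature, and accepts_keyword and get_outputs are illustrative helper names.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs catch-all accepts any keyword at call time.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an older and a newer transcribe() signature.
# These are NOT the real NeMo methods.
def transcribe_old(paths, batch_size=4, logprobs=False):
    return ["logits" if logprobs else "text" for _ in paths]


def transcribe_new(paths, batch_size=4, return_hypotheses=False):
    return ["hypothesis" if return_hypotheses else "text" for _ in paths]


def get_outputs(transcribe, paths):
    """Pass logprobs=True only when the callable supports it."""
    if accepts_keyword(transcribe, "logprobs"):
        return transcribe(paths, logprobs=True)
    return transcribe(paths, return_hypotheses=True)
```

A check like this only avoids the TypeError; whatever the fallback branch returns still has to be converted into the form the downstream decoder expects.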

Applied the change suggested in https://github.com/NVIDIA/NeMo/issues/8884#issuecomment-2049063944:

logits_hyps = asr_model.transcribe(
    self.audio_file_list, batch_size=self.asr_batch_size, return_hypotheses=True
)  # type: List[nemo_asr.parts.Hypothesis]

logits = [hyp.alignments for hyp in logits_hyps]

With this change, a new error occurs:

[NeMo I 2024-08-13 19:24:30 ctc_decoding:359] Beam search requires that consecutive CTC tokens are not folded. 
    Overriding provided value of `fold_consecutive` = True to False
Segmentation fault (core dumped)
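A bare "Segmentation fault" gives no hint about which native extension crashed. One option for narrowing it down is Python's standard-library faulthandler module, which dumps the Python-level traceback when a fatal signal arrives. The sketch below is only an illustration: run_decoding is a hypothetical placeholder for the call that crashes, and the log file name is an arbitrary choice.

```python
import faulthandler
import os
import tempfile

# Write fatal-signal tracebacks to a log file. The file object must stay
# open for as long as faulthandler is enabled.
trace_path = os.path.join(tempfile.gettempdir(), "segfault_trace.log")
trace_file = open(trace_path, "w")
faulthandler.enable(file=trace_file, all_threads=True)


def run_decoding():
    """Hypothetical placeholder for the beam search call that segfaults."""
    return "decoded"


result = run_decoding()
```

The same handler can be switched on without code changes by launching the interpreter with `python -X faulthandler`, in which case the traceback goes to stderr. The last Python frame before the crash usually shows whether the fault occurs inside the decoder bindings or elsewhere.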

I thought it was KenLM failing, so I applied this fix: https://github.com/flashlight/wav2letter/issues/875, but it did not work.

I searched further through the docs, PRs, and comments and found the karpnv/beamsearch branch from https://github.com/NVIDIA/NeMo/pull/8428. Even on that branch, I'm still unable to search for alpha and beta values.

The command:

python eval_beamsearch_ngram_ctc.py \
  model_path=/media/carlos/asr/Conformer-CTC-BPE.nemo \
  dataset_manifest=/media/carlos/asr/asr/supervised/test-files/test-all.json \
  preds_output_folder=preds/ \
  cache_file=null \
  ctc_decoding.beam.kenlm_path=/media/carlos/asr/conformerlm3.binary \
  ctc_decoding.beam.flashlight_cfg.lexicon_path=/media/carlos/asr/conformerlm3.binary.tmp.lexicon \
  ctc_decoding.beam.beam_size=[100,200,500] \
  ctc_decoding.beam.beam_alpha=[1,2,3,4] \
  ctc_decoding.beam.beam_beta=[1,2,3,4] \
  ctc_decoding.strategy=flashlight

Shows this error:

    raise NotImplementedError("Wrong parameter combination")
NotImplementedError: Wrong parameter combination
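For reference, the three list-valued options in the command above describe a grid of 3 x 4 x 4 = 48 decoding settings. The sketch below shows one way such a grid could be enumerated and searched. It is an assumption-laden illustration, not the script's actual implementation: build_grid and best_setting are hypothetical helpers, and the score function stands in for a real WER evaluation.

```python
from itertools import product

beam_sizes = [100, 200, 500]
beam_alphas = [1, 2, 3, 4]
beam_betas = [1, 2, 3, 4]


def build_grid(sizes, alphas, betas):
    """Enumerate every (beam_size, beam_alpha, beam_beta) combination."""
    return [
        {"beam_size": s, "beam_alpha": a, "beam_beta": b}
        for s, a, b in product(sizes, alphas, betas)
    ]


def best_setting(grid, score_fn):
    """Return the setting with the lowest score (for example, WER)."""
    return min(grid, key=score_fn)


grid = build_grid(beam_sizes, beam_alphas, beam_betas)
print(len(grid))  # 48
```

Each combination needs a full decoding pass over the test manifest, so the sweep costs 48 passes here, which is why a smaller first sweep can be worthwhile.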

Is there an updated guide, or can anyone help? It would be greatly appreciated. I can provide any extra details if needed.

Expected behavior

Being able to search for alpha and beta values using the generated KenLM model. We need those values to use RIVA.

Environment overview (please complete the following information)

Environment location: Bare-metal
Method of NeMo install: From source using ./reinstall.sh

Environment details

OS version: Ubuntu 22.04.3 LTS
PyTorch version: 2.4.0+cu121
Python version: 3.10.12

Additional context

GPU model: Nvidia A100

Aug 16 '24 02:08 carlfm01

The fix from #8884 worked for me. See this commit.

But I am using decoding_strategy="beam" and without KenLM.

Aug 17 '24 10:08 aklemen

The fix from #8884 worked for me. See this commit.

It works only without using KenLM. I need KenLM to find the optimal alpha and beta values to deploy it to RIVA.

Aug 19 '24 05:08 carlfm01

@karpnv, can you review this?

Aug 22 '24 17:08 titu1994

Any updates? Still blocked and unable to deploy to RIVA.

Sep 02 '24 07:09 carlfm01

I'm having the same issue. What worked for me was implementing the same changes mentioned by @aklemen and changing _wer in this line to wer. I'm using KenLM with default parameters.

Sep 27 '24 16:09 MedAymenF

I'm having the same issue. What worked for me was implementing the same changes mentioned by @aklemen and changing _wer in this line to wer. I'm using KenLM with default parameters.

Thanks for the suggestion, but I’ve already tried those changes before, and I’m still facing the same issue. It works perfectly in Riva, but I can't get it to work in NeMo.

Sep 29 '24 21:09 carlfm01

Hi @karpnv, any update regarding this? I'm facing the same issue.

Oct 13 '24 08:10 mehadi92

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

Nov 13 '24 01:11 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

Nov 20 '24 02:11 github-actions[bot]

Any update? Without working NeMo tools, it's impossible to test a NeMo model before deploying it on Riva. Is NeMo still being maintained?

Feb 09 '25 05:02 carlfm01