Vamsi Chagari
Hi @tlikhomanenko, Thank you very much for the response. Could you please help me by telling me how to convert the per-frame token indices to word timings? Please provide an...
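A minimal sketch of the kind of conversion I have in mind (an illustration only, not code from Decode.cpp). It assumes a letter-token CTC model with '#' as the blank, '|' as the word separator, a 10 ms feature frame shift, and a total model stride of 1; all of these values are assumptions, not taken from this thread.

```python
# Hypothetical illustration: turning per-frame token indices emitted by a
# letter-token CTC acoustic model into word timings.
def frames_to_word_timings(tokens, frame_shift_ms=10, stride=1,
                           blank='#', sep='|'):
    """tokens: one token string per acoustic-model output frame.
    Returns a list of (word, start_ms, end_ms) tuples."""
    words, letters = [], []
    start = end = None      # output-frame indices of the current word
    prev = blank
    for i, tok in enumerate(tokens):
        if tok == sep:
            if letters:     # word separator closes the current word
                words.append((''.join(letters),
                              start * frame_shift_ms * stride,
                              (end + 1) * frame_shift_ms * stride))
                letters, start, end = [], None, None
        elif tok != blank:
            if tok != prev:             # collapse CTC repeats
                if start is None:
                    start = i
                letters.append(tok)
            end = i
        prev = tok
    if letters:             # flush a trailing word with no final separator
        words.append((''.join(letters),
                      start * frame_shift_ms * stride,
                      (end + 1) * frame_shift_ms * stride))
    return words

# Example: each character below stands for one output frame.
print(frames_to_word_timings(list("#hh#e#ll#l#o|w#oo#r#l#d#")))
# -> [('hello', 10, 120), ('world', 130, 230)]
```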
Hi @tlikhomanenko, Okay, thank you. I am referring to Decode.cpp. Please find the info below. Flags: Criterion is set to "ctc", Surround is not set, Relabel is not set, Usewordpiece...
Thank you @tlikhomanenko for the response, I appreciate it. I did test the decoder after making the code changes. The word timings I calculated based on the info that's there...
Hi @tlikhomanenko, Thank you for your comments. 1. If I remove the first and last silence tokens as you said, the word timings of each individual word are still...
Hi @tlikhomanenko, I realized later that you might be referring to the total duration. Thank you for your comments. Okay, I did actually explore the other Wav2letter recipes and figured...
Thank you for your comments. @tlikhomanenko: Please address my questions below. 1. There are no convolution or pooling layers in the lexicon-free architecture, so the default stride is 1?! Lexicon_Free arch:...
Hi @tlikhomanenko, I changed the total stride to 3 from 7 in the streaming ConvNets recipe architecture and trained it from scratch on LibriSpeech data with letter tokens. The AM model seems...
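For reference, the way I understand the total stride (my own assumption, not taken from the recipe files): it is simply the product of the strides of all convolution and pooling layers, so an architecture with none of those keeps the default total stride of 1. The layer strides below are illustrative, not the actual arch files.

```python
# Hypothetical sketch: total output stride = product of per-layer strides.
from functools import reduce

def total_stride(layer_strides):
    """layer_strides: strides of all conv/pooling layers in the architecture."""
    return reduce(lambda a, b: a * b, layer_strides, 1)

print(total_stride([]))            # no conv/pool layers -> default stride 1
print(total_stride([2, 1, 2, 1]))  # e.g. two stride-2 convolutions -> 4
```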
Hi @micahjon, I have a question: were you able to run OpenSeq2Seq on Wav2letter? I see that speech2text is using a DeepSpeech framework.
Hi @xuqiantong, I referred to your comments in ticket #400 (https://github.com/flashlight/wav2letter/issues/400#issuecomment-529723436). Could you please help me by explaining how to get the begin and end times for each word...
Thank you @tlikhomanenko. @vineelpratap, @avidov: Could you please help me convert the model to the new format?