Kavya Manohar
The same error pops up when I try to perform syllabification too.
```
from indicnlp.syllable import syllabifier
w='जगदीशचंद्र'
lang='hi'
syllabifier.orthographic_syllabify(w,lang)
```
It returns
```
line 170, in get_phonetic_feature_vector
    if phonetic_data.iloc[offset]['Valid...
```
This is exciting, @RuABraun. Looking forward to it. On another note, since `align_text()` expects a StringVector, it in fact allowed me to pass a custom-tokenized hypothesis and reference. In the code snippet above,...
Thanks @nshmyrev. The Malayalam dataset in https://github.com/Open-Speech-EkStep/ULCA-asr-dataset-corpus is currently 'unlabelled'. I think we cannot use it unless transcripts are available.
Thanks for your time and effort in reviewing it, @nshmyrev. The WERs are higher on test datasets where OOV rates are quite high. Test set 1: 8% WER (1%...