Ye
Thanks for the explanations! Just to verify: in my case, after I trained delta+delta-delta using `steps/train_deltas.sh`, I also applied the LDA+MLLT transformation using `steps/train_lda_mllt.sh --splice-opts "--left-context=3 --right-context=3"`, the default dimension output...
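For anyone checking the numbers, here is a rough sketch of how the splice options determine the dimension of the feature vector before LDA. The per-frame dimension of 13 is only an assumption for illustration (a common MFCC setup), so check it on your own data, for example with `feat-to-dim`.

```shell
# Splice context taken from the --splice-opts in the comment above.
left_context=3
right_context=3

# ASSUMPTION: 13-dimensional per-frame features; replace with your real dimension.
feat_dim=13

# Each spliced vector stacks the current frame plus its left and right neighbours.
spliced_frames=$(( left_context + 1 + right_context ))
spliced_dim=$(( spliced_frames * feat_dim ))

echo "frames per window: $spliced_frames"
echo "dimension before LDA: $spliced_dim"
```

LDA then projects this spliced vector down to a smaller output dimension.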
[check_phones_compatible.sh](https://github.com/kaldi-asr/kaldi/blob/master/egs/wsj/s5/utils/lang/check_phones_compatible.sh)
```
#Check the two tables are same or not (except for possible difference in disambiguation symbols).
if ! cmp -s
```
the word symbol table file (a.k.a. `word.txt`) does not contain the words listed above, which is weird because `word.txt` is supposed to contain the unique words that show up...
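One way to double-check this kind of mismatch is to list every transcript word that is absent from the symbol table. This is just a sketch: `list_missing_words` is a made-up helper name, and it assumes the symbol table has one `word id` pair per line and the transcript has one `utt-id word1 word2 ...` entry per line.

```shell
# list_missing_words SYMBOL_TABLE TRANSCRIPT
# Prints each transcript word that has no entry in the symbol table, one per line.
list_missing_words() {
  awk '
    # First file: remember every word in the symbol table (column 1).
    NR == FNR { seen[$1] = 1; next }
    # Second file: skip the utterance ID in column 1, check the remaining words.
    { for (i = 2; i <= NF; i++) if (!($i in seen)) print $i }
  ' "$1" "$2" | LC_ALL=C sort -u
}
```

Usage would look like `list_missing_words path/to/symbol_table path/to/transcript`, with both paths replaced by your own files. Empty output means every transcript word is in the table.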
We are going to convert everything in the transcript to UPPER case (the lexicon is already all UPPER case).
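A minimal sketch of that conversion, assuming the transcript uses the usual `utt-id word1 word2 ...` layout so that the utterance ID in the first column must stay untouched. The helper name `upper_transcript` is hypothetical.

```shell
# Reads a transcript on stdin and writes it to stdout with every word upper-cased.
# The first field (the utterance ID) is printed unchanged.
upper_transcript() {
  awk '{
    printf "%s", $1
    for (i = 2; i <= NF; i++) printf " %s", toupper($i)
    printf "\n"
  }'
}
```

It can be used as `upper_transcript < text > text.upper`, where both file names are placeholders. Note that it collapses runs of whitespace between words into single spaces, and whether `toupper` changes non-ASCII characters depends on the awk implementation and locale.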
The dataset I used is from the LDC (Linguistic Data Consortium). Schools usually pay for LDC membership, which allows their students to use LDC’s datasets for academic purposes. You can go...
Ah, I see. We only used SEAME. I don’t know much about the whereabouts of this Putonghua-English code-switch dataset. You may want to contact the authors listed on the paper...
Maybe contact speechOcean (海天瑞声). I remember getting in touch with them briefly about access to this dataset, but it also comes at a high price... things might have...
My professor said this could be caused by misalignment. I executed `steps/train_deltas.sh` with the following command, found in [Kaldi for Dummies](http://kaldi-asr.org/doc/kaldi_for_dummies.html):
```
steps/train_deltas.sh 2000 11000 data/train data/lang exp/mono_ali exp/tri1
```
...