Rémi Francis
It's so this LM can be used with http://www.speech.sri.com/projects/srilm/manpages/hidden-ngram.1.html to add end-of-sentence markers. On 13 October 2016 at 17:34, Daniel Povey wrote: > Is there a compelling reason...
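As a rough sketch of what that would look like (the file names and the choice of `</s>` as the hidden event are assumptions here, and the exact flags should be double-checked against the man page linked above):

```sh
# hidden.vocab lists the hidden events the tagger is allowed to insert
# (here only an end-of-sentence token; file names are placeholders).
echo "</s>" > hidden.vocab

# Tag the unsegmented text with the most likely sentence boundaries,
# using the pruned LM in lm.arpa (assumed file name).
hidden-ngram \
  -lm lm.arpa \
  -order 3 \
  -hidden-vocab hidden.vocab \
  -text unsegmented.txt \
  > segmented.txt
```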
With my current experiments, pocolm still seems to be worth it. Do you think the efficiency can depend on the size of the training set? Also there is the...
I have trained a trigram on one training set of 1.5G words, and I prune it to about 1M n-grams. On the test sets I get: pocolm gets 153 ppl...
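For reference, the comparison can be made with both pruned ARPA files scored on the same test set using SRILM's `ngram` tool (the file names below are placeholders, not the actual files used here):

```sh
# Perplexity of the pocolm-trained, pruned trigram (assumed file name).
ngram -order 3 -lm pocolm_pruned.arpa -ppl test.txt

# Same test set, scored with the SRILM trigram pruned to a similar size.
ngram -order 3 -lm srilm_pruned.arpa -ppl test.txt
```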
Btw, when it doesn't print the end-of-sentence symbol in the arpa file, it still counts it in the `ngram 1=` line as if it were there.
It's when I have the whole training text on one line and then prune the LM. The SRILM results are with Good-Turing discounting.
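A quick way to see the mismatch described above, assuming the pruned LM is in a file called lm.arpa: compare the `ngram 1=` header against the number of entries actually listed in the unigram section.

```sh
# Unigram count as reported in the ARPA header.
grep -m1 '^ngram 1=' lm.arpa

# Actual number of entries in the \1-grams: section
# (counts lines until the first blank line after the section starts).
awk '/\\1-grams:/{in1=1; next} in1 && NF==0 {exit} in1 {n++} END {print n}' lm.arpa
```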
Not the most convenient solution when I have many sources with varying amounts of data, but it'll probably do. What is the impact of the dev set on the final...
What if there is only one data source? On 13 October 2016 at 17:38, Daniel Povey wrote: > The dev set will definitely affect the interpolation weights of the...
@danpovey I've been looking into adding some pruning to the iterative rescoring; do you think `lattice-limit-depth` is a better fit than `lattice-determinize-pruned`? It would make sense, as the number...
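For context, a sketch of what the two alternatives look like on the command line; the archive names and the beam/depth values are made up, not a final recipe:

```sh
# Alternative 1: prune while determinizing; --beam controls how much of
# the lattice is kept (values are placeholders).
lattice-determinize-pruned --acoustic-scale=0.1 --beam=6.0 \
  ark:lat.in.ark ark:lat.det.ark

# Alternative 2: cap the lattice depth directly, limiting the number of
# arcs crossing any given frame before rescoring.
lattice-limit-depth --acoustic-scale=0.1 --max-arcs-per-frame=1000 \
  ark:lat.in.ark ark:lat.limited.ark
```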
I'm rather trying to reach the same WERs as my previous rescoring method without making batches too big. I'm still a bit off in terms of WERs, despite getting batches...
Do you have numbers for training a model of reduced size from scratch (and then training it again for 1 epoch)?