Viacheslav Seledkin

Results 4 issues of Viacheslav Seledkin

Adds support of UTF-8, must be compatible with models created from one-byte encoded inputs (but this is not verified), not compatible with old models created from utf-8 because they have...

Currently, text is generated from latent point by sampling from distributions produced by generator over vocabulary of tokens. But since z is multivariate gaussian we can also sample from it...

This adds UTF-8 support for multibyte encoded corpora

It probably must start from 0? one key and value which is at 0 is always missed when using iterator correction: iterator interface is a bit confusing, right pattern of...