Kirill Gelvan
Oh, that's unfortunate. If you are looking for a Russian-language model, you can try these: [https://huggingface.co/DeepPavlov/rudialogpt3_medium_based_on_gpt2_v2](https://huggingface.co/DeepPavlov/rudialogpt3_medium_based_on_gpt2_v2) and [https://huggingface.co/tinkoff-ai/ruDialoGPT-medium](https://huggingface.co/tinkoff-ai/ruDialoGPT-medium). Or search here: [https://huggingface.co/models](https://huggingface.co/models). I don't have a copy of...
If you have context-answer pairs, you should look into training models like T5/mT5/flan-T5/ru-T5 or other text-to-text models instead of pure generators like GPT-2/3. But generally it's better to go with...
Hi, I'll try to explain it in a couple of days. Hope it'll still be useful for you. I received your email, btw.
Hi, separable convolutions are a trick described in the QuartzNet paper. In short, they use fewer parameters while achieving nearly the same results (so they make the model smaller and faster for...
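To make the parameter saving concrete, here is a rough sketch that just counts weights for a regular 1D convolution versus a depthwise + pointwise (separable) pair. The function names and the channel/kernel sizes are my own illustrative assumptions, not values taken from this thread, and the counts ignore normalization layers.

```python
def conv1d_params(c_in, c_out, kernel_size, bias=False):
    """Parameter count of a regular 1D convolution."""
    return c_in * c_out * kernel_size + (c_out if bias else 0)


def separable_conv1d_params(c_in, c_out, kernel_size, bias=False):
    """Parameter count of a depthwise conv followed by a 1x1 pointwise conv."""
    # Depthwise: one kernel of length kernel_size per input channel.
    depthwise = c_in * kernel_size + (c_in if bias else 0)
    # Pointwise: a 1x1 convolution that mixes channels.
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative sizes only (assumed for this example).
    c_in, c_out, k = 256, 256, 33
    regular = conv1d_params(c_in, c_out, k)
    separable = separable_conv1d_params(c_in, c_out, k)
    print(f"regular:   {regular:,}")
    print(f"separable: {separable:,}")
    print(f"ratio:     {regular / separable:.1f}x")
```

The saving grows with the kernel size, since the regular layer scales as `c_in * c_out * kernel_size` while the separable one scales as `c_in * (kernel_size + c_out)`.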
As far as I remember, the paper can be unclear about which blocks use sepconvs. But we have tried to fully reproduce the paper and the...
Hi folks, do you have anything to share? I am also trying to reproduce pretraining, but it does seem to be very slow (too slow). @getao you mentioned you used...
Wow @toilaluan, thanks for the paper!!! The code at your link does not open, though. Maybe there is a misspelling, or it is private?