Haris Jabbar

Results: 4 issues by Haris Jabbar

Hi. I am trying to convert corpora from HF to their IPA form with the following snippet, but I am getting really slow speeds: only a couple of examples per...

Hi! Thank you for the great repo and the models. I want to pretrain the model with a new [tokenizer](https://arxiv.org/abs/2307.07262), but since 16 A100 GPUs are hard to come by, I...

@karpathy Thanks for the great lecture and implementation! As always, it was a pleasure. I have tried to implement LlamaTokenizer (without using the sentencepiece backend), staying as close to the minbpe implementation...

If I am not mistaken, the sum of the PRS_signature values should be equal to Nmorph. However, during data exploration I found quite a few entries where these values don't match....