tiktoken-php
This is a port of the tiktoken library.
Hello, looking around I found implementations in Java and Node for calculating tokens for function/tool calls. I ported the code to PHP; it might be useful to add that code...
Hey, thanks for porting this over! I wanted to move to PHP to remove an extra dependency (a Docker server exposing the Python tiktoken over an API). I decided to do a small...
Hi, I just had feedback from my users that they are experiencing this error: `[error] [0] No rank for bytes vector: [33] [src/Vocab/Vocab.php:120]`. Since this is a problem found by...
Changes have been made to handle a fopen `error:0A000086:SSL routines::certificate verify failed` error. Downside: the SSL check is ignored for fopen requests towards OpenAI. This was using the gpt-4o model.
Are there plans to support the o1 models, `o1-preview` and `o1-mini`? I assume they use already-existing encodings, so adding support for them should not be a problem.
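For context, upstream tiktoken resolves `o1-` prefixed models to the existing `o200k_base` encoding (the same one `gpt-4o` uses), so support would likely amount to extending a prefix table. Below is a hedged, self-contained sketch of such a lookup; the function name and table are illustrative, not the library's actual internals:

```php
<?php
// Illustrative sketch (not the library's real code): resolve a model
// name to an encoding via a longest-known-prefix table, mirroring
// upstream tiktoken's MODEL_PREFIX_TO_ENCODING mapping.

function encodingForModel(string $model): string
{
    $prefixes = [
        'o1-' => 'o200k_base',          // o1-preview, o1-mini
        'gpt-4o-' => 'o200k_base',      // checked before the gpt-4- prefix
        'gpt-4-' => 'cl100k_base',
        'gpt-3.5-turbo-' => 'cl100k_base',
    ];

    foreach ($prefixes as $prefix => $encoding) {
        if (str_starts_with($model, $prefix)) {
            return $encoding;
        }
    }

    throw new InvalidArgumentException("Unknown model: {$model}");
}

echo encodingForModel('o1-preview'), "\n"; // o200k_base
echo encodingForModel('o1-mini'), "\n";    // o200k_base
```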
For example, a long input string that is all letters takes a very long time to run because it ends up as one giant `mergeBytePairs` call: ``` $input = implode('',...
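To illustrate why a long run of identical letters is a worst case, here is a minimal, self-contained sketch of a naive byte-pair merge loop (not the library's actual implementation): each merge rescans the whole sequence for the lowest-ranked pair, so n bytes that collapse into one token cost O(n²) pair lookups.

```php
<?php
// Naive BPE-style merge: repeatedly find the lowest-rank adjacent pair
// and merge it. The full rescan inside the while loop is what makes a
// uniform input quadratic.

function mergePairs(array $parts, array $ranks): array
{
    while (count($parts) > 1) {
        $bestRank = PHP_INT_MAX;
        $bestIndex = -1;

        // Full rescan on every iteration -- the quadratic part.
        for ($i = 0; $i < count($parts) - 1; $i++) {
            $rank = $ranks[$parts[$i] . $parts[$i + 1]] ?? PHP_INT_MAX;
            if ($rank < $bestRank) {
                $bestRank = $rank;
                $bestIndex = $i;
            }
        }

        if ($bestIndex === -1) {
            break; // no mergeable pair left
        }

        // Merge the winning pair in place.
        array_splice($parts, $bestIndex, 2, [$parts[$bestIndex] . $parts[$bestIndex + 1]]);
    }

    return $parts;
}

// Toy rank table where a run of "a" keeps merging all the way up.
$ranks = ['aa' => 0, 'aaaa' => 1, 'aaaaaaaa' => 2];
$input = str_split(str_repeat('a', 8));
var_dump(mergePairs($input, $ranks)); // collapses to a single "aaaaaaaa"
```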
When tokenizing large input strings, `mergeBytePairs` seems to be the bottleneck even when it isn't [degrading to quadratic](https://github.com/yethee/tiktoken-php/issues/25). On my workload, I found that a small change of caching the...
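The excerpt is cut off, but a cache of the kind it describes could look like the following sketch: memoizing the rank looked up for each byte pair so repeated pairs in a large input hit an array instead of redoing the lookup. The class and method names here are invented for illustration; they are not the library's API.

```php
<?php
// Hypothetical rank-lookup cache. The first lookup of a pair pays the
// full cost; every repeat of that pair is a cheap array hit.

final class CachedRanks
{
    /** @var array<string, int> */
    private array $cache = [];

    public int $misses = 0;

    public function __construct(private array $ranks)
    {
    }

    public function rankOf(string $left, string $right): int
    {
        $key = $left . "\0" . $right;

        if (!isset($this->cache[$key])) {
            $this->misses++;
            // Stand-in for the real (more expensive) vocabulary lookup.
            $this->cache[$key] = $this->ranks[$left . $right] ?? PHP_INT_MAX;
        }

        return $this->cache[$key];
    }
}

$ranks = new CachedRanks(['ab' => 7]);
$ranks->rankOf('a', 'b');
$ranks->rankOf('a', 'b');
$ranks->rankOf('a', 'b');
echo $ranks->misses, "\n"; // 1 -- only the first lookup pays the cost
```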
Take the given test:

```php
$usage = memory()[1];
$provider = new \Yethee\Tiktoken\EncoderProvider;
$provider->setVocabCache(storage_path('app'));
$encoder = $provider->getForModel('gpt-4o-mini');
dd(memory()[1] - $usage); // 26mb!
```

26mb seems a bit much, no? Especially considering the cached...
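The snippet above relies on app-specific helpers (`memory()`, `dd()` look like framework utilities). A framework-free way to take the same kind of measurement uses `memory_get_usage()` directly; the sketch below also hints at why a rank table can dwarf the cached file size, since PHP arrays carry substantial per-entry overhead. The 100k-entry loop is a stand-in for a vocabulary, not the library's loader.

```php
<?php
// Measure the heap growth of building a vocabulary-sized array,
// without framework helpers. The array is a stand-in for a token=>rank
// table; PHP's per-entry overhead is why the in-memory footprint can be
// much larger than the vocab file on disk.

$before = memory_get_usage();

$vocab = [];
for ($i = 0; $i < 100_000; $i++) {
    $vocab['token' . $i] = $i;
}

$after = memory_get_usage();
printf("%.1f MB\n", ($after - $before) / 1024 / 1024);
```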