
Error: null pointer passed to rust

Open emeeks opened this issue 2 years ago • 4 comments

tiktoken was encoding and decoding tokens fine for me. Then, without any code changes (I was just demoing what had worked before), it failed with this error:

Unhandled Runtime Error
Error: null pointer passed to rust

Source
src/components/patterns/Sidebar/AISidebarPanel/generateContext.ts (134:19) @ encode

  132 | }
  133 | 
> 134 | let tokens = enc.encode(fullContext);

From the console:

tiktoken_bg.js:437 Uncaught (in promise) Error: null pointer passed to rust
    at __wbindgen_throw (tiktoken_bg.js:437:1)
    at c3f4a228ecb7380f.wasm:0x7a33d
    at c3f4a228ecb7380f.wasm:0x7a323
    at c3f4a228ecb7380f.wasm:0x731b3
    at c3f4a228ecb7380f.wasm:0x4cae3
    at Tiktoken.encode (tiktoken_bg.js:284:1)

In this case, fullContext is just a long string; it is neither null nor undefined. What's very strange is that it was working fine yesterday and now fails like this. I tried fresh installs and everything, but any tips would be appreciated.

emeeks, Sep 06 '23 21:09

I have the same problem. The input is a string with roughly 20,000 characters. My workaround is to .free() the encoder and try again.

EDIT: No, that does not work. I just create a new encoder. If I .free() the old one, the new one fails with the same "null pointer passed to rust" error.

sean-nicholas, Sep 18 '23 15:09

I debugged this a bit and can reproduce the error by calling .free() on the encoding before calling encode. I get the same error message:

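import { get_encoding } from "tiktoken"

const encoding = get_encoding(encodingName) // encodingName: placeholder for the encoding name, e.g. "cl100k_base"
encoding.free()
encoding.encode(content) // <-- throws here with "null pointer passed to rust"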

Calling .free() twice in a row produces the same error:

const encoding = get_encoding(encodingName) // encodingName: placeholder for the encoding name, e.g. "cl100k_base"
encoding.free()
encoding.free() // <-- throws here with "null pointer passed to rust"
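Both reproductions above come down to touching an encoder whose memory has already been released. One way to make that mistake easier to spot is a small wrapper that remembers whether free() has run. This is only a sketch: GuardedEncoder and FreeableEncoder are hypothetical names, not part of tiktoken, and the interface assumes only the encode() and free() methods used in this thread.

```typescript
// Minimal shape this sketch relies on (an assumption, not the full tiktoken type).
interface FreeableEncoder {
  encode(text: string): Uint32Array;
  free(): void;
}

// Hypothetical wrapper that tracks whether free() has already been called.
class GuardedEncoder {
  private readonly inner: FreeableEncoder;
  private freed = false;

  constructor(inner: FreeableEncoder) {
    this.inner = inner;
  }

  encode(text: string): Uint32Array {
    if (this.freed) {
      // Fail with a descriptive message instead of the opaque null pointer error.
      throw new Error("encode() called after free(); create a new encoder first");
    }
    return this.inner.encode(text);
  }

  // Idempotent: the second and later calls do nothing.
  free(): void {
    if (this.freed) return;
    this.freed = true;
    this.inner.free();
  }
}
```

With something like this, a double free() becomes a no-op and a use after free() reports where the stale encoder was used.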

I started getting this error after I refactored my code. The problem was that get_encoding took 50 ms every time I called it, so counting tokens for multiple messages sometimes took up to several seconds. So I started caching the encoding in a Map() and reusing it. Maybe some........

Oh my god 🤦‍♂️ While writing this I realized that I also called .free() in another method, and that method used the cached encoding. So, well, my fault :D @emeeks, maybe you have a similar issue?
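For anyone caching encoders the same way, a rough sketch of a cache that owns the encoder lifetime might look like this, so that no other method has a reason to call .free() on a shared instance. EncoderCache, FreeableEncoder and the create callback are hypothetical names for illustration; with the real library the callback would presumably wrap get_encoding.

```typescript
// Minimal shape this sketch relies on (an assumption, not the full tiktoken type).
interface FreeableEncoder {
  encode(text: string): Uint32Array;
  free(): void;
}

// Hypothetical cache: the only place that creates and frees encoders.
class EncoderCache {
  private readonly encoders = new Map<string, FreeableEncoder>();
  private readonly create: (name: string) => FreeableEncoder;

  constructor(create: (name: string) => FreeableEncoder) {
    this.create = create;
  }

  // Returns the cached encoder for `name`, creating it on first use.
  get(name: string): FreeableEncoder {
    let encoder = this.encoders.get(name);
    if (encoder === undefined) {
      encoder = this.create(name);
      this.encoders.set(name, encoder);
    }
    return encoder;
  }

  // Frees every cached encoder and empties the map, so a later get()
  // builds a fresh encoder instead of handing out a freed one.
  dispose(): void {
    this.encoders.forEach((encoder) => encoder.free());
    this.encoders.clear();
  }
}
```

Callers only ever call get() and encode(); dispose() runs once, for example at shutdown.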

sean-nicholas, Sep 18 '23 15:09

Hi, I'm having the same issue. "null pointer passed to rust" for a regular string. Does anyone have any workarounds?

alimoezzi, May 23 '24 21:05

I get the same error when I reuse the encoder after calling free(). I had a server that polled for encoding jobs and I'd implemented it like this:

const encoder = get_encoding("cl100k_base"); // global encoder

while (true) { // get next batch of docs to encode
    for (const doc of docs) {
        const encoded = encoder.encode(doc.content); // throws the null pointer error on every pass of the outer loop after the first
    }
    encoder.free(); // free the encoder
}

There's some good discussion related to free() on this thread: https://github.com/dqbd/tiktoken/issues/72

I essentially ended up respawning the encoder after freeing it, which resolved the issue. For example:

while (true) { // get next batch of docs to encode
    const encoder = get_encoding("cl100k_base"); // spawn a new encoder
    for (const doc of docs) {
        const encoded = encoder.encode(doc.content);
    }
    encoder.free(); // free the encoder
}
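The same per-batch pattern can be written with try/finally so the encoder is still freed exactly once if encoding throws partway through a batch. This is a sketch under assumptions: withEncoder, countTokens and FreeableEncoder are hypothetical helpers, and the create callback stands in for something like () => get_encoding("cl100k_base").

```typescript
// Minimal shape this sketch relies on (an assumption, not the full tiktoken type).
interface FreeableEncoder {
  encode(text: string): Uint32Array;
  free(): void;
}

// Creates an encoder, runs `work` with it, and always frees it afterwards.
// Synchronous only: with an async `work`, free() would run before the promise settles.
function withEncoder<T>(
  create: () => FreeableEncoder,
  work: (encoder: FreeableEncoder) => T,
): T {
  const encoder = create();
  try {
    return work(encoder);
  } finally {
    encoder.free(); // runs once, whether work() returned or threw
  }
}

// Hypothetical batch helper: token counts for one batch of documents.
function countTokens(create: () => FreeableEncoder, docs: string[]): number[] {
  return withEncoder(create, (encoder) =>
    docs.map((doc) => encoder.encode(doc).length),
  );
}
```

Because the encoder never escapes the callback, nothing outside can reuse it after free().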

ramnique, Sep 30 '24 10:09