
panic: runtime error: invalid memory address or nil pointer dereference

Open · runarheggset opened this issue 2 years ago · 0 comments

I encountered a panic while encoding some documents. Unfortunately I can't provide the documents, as they are private.

After a quick look, it seems that pairEncoding is nil at util.go:108, so the GetIds() call on it fails.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7a8854]

goroutine 295 [running]:
github.com/sugarme/tokenizer.(*Encoding).GetIds(...)
        /home/superman/go/pkg/mod/github.com/sugarme/[email protected]/encoding.go:215
github.com/sugarme/tokenizer.TruncateEncodings(0xc00013e270, 0x0, 0x350?)
        /home/superman/go/pkg/mod/github.com/sugarme/[email protected]/util.go:108 +0x54
github.com/sugarme/tokenizer.(*Tokenizer).PostProcess(0xc0000fa000, 0xc00013e270?, 0x0?, 0x1)
        /home/superman/go/pkg/mod/github.com/sugarme/[email protected]/tokenizer.go:602 +0xef
github.com/sugarme/tokenizer.(*Tokenizer).Encode(0x0?, {0x847520?, 0xc000025c40?}, 0x0?)
        /home/superman/go/pkg/mod/github.com/sugarme/[email protected]/tokenizer.go:464 +0x2e5
github.com/sugarme/tokenizer.(*Tokenizer).EncodeBatch.func1(0xdc)
        /home/superman/go/pkg/mod/github.com/sugarme/[email protected]/tokenizer.go:647 +0x90
created by github.com/sugarme/tokenizer.(*Tokenizer).EncodeBatch in goroutine 42
        /home/superman/go/pkg/mod/github.com/sugarme/[email protected]/tokenizer.go:644 +0xea

runarheggset, Jan 30 '24 21:01