panic: runtime error: invalid memory address or nil pointer dereference
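I hit a panic while encoding some documents with EncodeBatch. Unfortunately I can't share the documents, as they are private.
After a quick look, it seems that pairEncoding is nil at util.go:108 (inside TruncateEncodings), so the GetIds() call on it dereferences a nil pointer. This matches the trace below, where the second argument to TruncateEncodings is 0x0.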
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7a8854]
goroutine 295 [running]:
github.com/sugarme/tokenizer.(*Encoding).GetIds(...)
/home/superman/go/pkg/mod/github.com/sugarme/[email protected]/encoding.go:215
github.com/sugarme/tokenizer.TruncateEncodings(0xc00013e270, 0x0, 0x350?)
/home/superman/go/pkg/mod/github.com/sugarme/[email protected]/util.go:108 +0x54
github.com/sugarme/tokenizer.(*Tokenizer).PostProcess(0xc0000fa000, 0xc00013e270?, 0x0?, 0x1)
/home/superman/go/pkg/mod/github.com/sugarme/[email protected]/tokenizer.go:602 +0xef
github.com/sugarme/tokenizer.(*Tokenizer).Encode(0x0?, {0x847520?, 0xc000025c40?}, 0x0?)
/home/superman/go/pkg/mod/github.com/sugarme/[email protected]/tokenizer.go:464 +0x2e5
github.com/sugarme/tokenizer.(*Tokenizer).EncodeBatch.func1(0xdc)
/home/superman/go/pkg/mod/github.com/sugarme/[email protected]/tokenizer.go:647 +0x90
created by github.com/sugarme/tokenizer.(*Tokenizer).EncodeBatch in goroutine 42
/home/superman/go/pkg/mod/github.com/sugarme/[email protected]/tokenizer.go:644 +0xea
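To illustrate the failure mode as I understand it, here is a minimal standalone sketch. It does not use the library: the Encoding type and the totalLen helper are hypothetical stand-ins I made up for this example, and I am only assuming that the real code calls GetIds() on the pair encoding without checking it for nil first.

```go
package main

import "fmt"

// Encoding is a hypothetical stand-in, not the library's actual type.
type Encoding struct {
	Ids []int
}

// GetIds reads a field through the pointer receiver, so calling it on a
// nil *Encoding panics with a nil pointer dereference.
func (e *Encoding) GetIds() []int {
	return e.Ids
}

// totalLen is a hypothetical helper that counts the ids of an encoding and
// of an optional pair encoding. The nil check on pair is the kind of guard
// that appears to be missing at the point where the panic occurs.
func totalLen(enc, pair *Encoding) int {
	n := len(enc.GetIds())
	if pair != nil {
		n += len(pair.GetIds())
	}
	return n
}

func main() {
	enc := &Encoding{Ids: []int{101, 2023, 102}}
	// A single-sequence input has no pair encoding, so pair is nil here.
	fmt.Println(totalLen(enc, nil))
}
```

Removing the `pair != nil` check in this sketch reproduces the same kind of panic as in the trace above. Whether a guard like this is the right fix in the library itself depends on how TruncateEncodings is meant to treat a missing pair encoding.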