Iterator readEntry crashes
What is the issue you are having?
Sometimes when using EntryInfoIterator to read all entries from the cache, the readEntry() call crashes with the following stack trace:
runtime error: makeslice: len out of range
panic(0x1fabc00, 0x27ff350)
/usr/local/go/src/runtime/panic.go:975 +0x3e3
github.com/allegro/bigcache.readEntry(...)
/usr/local/Code/src/github.com/allegro/bigcache/encoding.go:36
github.com/allegro/bigcache.(*EntryInfoIterator).Value(0xc10fd94a50, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/local/Code/src/github.com/allegro/bigcache/iterator.go:120 +0x305
We only recently updated from an old version of bigcache. The error seems to indicate that the object being read is somehow larger than the maximum slice length, which should not be possible given that we marshal the structs to bytes using protobuf's Marshal() method:
func (m *CacheEntry) Marshal() (dAtA []byte, err error) {
	size := m.Size()
	dAtA = make([]byte, size)
	n, err := m.MarshalTo(dAtA)
	if err != nil {
		return nil, err
	}
	return dAtA[:n], nil
}
We are running on a 64bit system and we limit the size of all entries to ~1MB anyhow.
What is BigCache doing that it shouldn't?
It is not clear from reading the code, but it seems related to how readEntry() determines the entry length; perhaps some kind of overflow issue.
Environment:
- Version (git sha or release): v2.2.1
- OS (e.g. from /etc/os-release or winver.exe): Ubuntu 16.04
- Go version: 1.14.2
Well that's exciting! Can you provide a quick reproduction of the bug so it's a bit easier to understand the full context? I think I understand where it's breaking, but a code example will help debug it a bit more.
Hey @mxplusb - it's tricky to reproduce exactly, since it doesn't occur consistently; some unusual condition seems to be met only occasionally. Here is a mockup that roughly represents what we are doing:
package main

import (
	"encoding/binary"
	"fmt"
	"math/rand"
	"time"

	"github.com/allegro/bigcache"
)

var cache *bigcache.BigCache

type Entry struct {
	id uint32
	v  []byte
}

func uint32Bytes(v uint32) []byte {
	dAtA := make([]byte, 4)
	binary.LittleEndian.PutUint32(dAtA, v)
	return dAtA
}

func (e Entry) Marshal() []byte {
	dAtA := uint32Bytes(e.id)
	dAtA = append(dAtA, e.v...)
	return dAtA
}

func assignRand(p []byte) {
	for i := range p {
		p[i] = byte(rand.Int63() & 0xff)
	}
}

func seed() {
	for i := 0; i < 100000; i++ {
		v := make([]byte, rand.Intn(1024))
		assignRand(v)
		e := Entry{
			id: uint32(i),
			v:  v,
		}
		cache.Set(string(uint32Bytes(e.id)), e.Marshal())
	}
}

func init() {
	conf := bigcache.DefaultConfig(60 * 24 * time.Hour)
	conf.HardMaxCacheSize = 100 // actually dynamically calculated
	conf.MaxEntrySize = 500
	var err error
	cache, err = bigcache.NewBigCache(conf)
	if err != nil {
		panic(err)
	}
	seed()
}

func main() {
	rand.Seed(1)
	it := cache.Iterator()
	var current bigcache.EntryInfo
	var err error
	for it.SetNext() {
		current, err = it.Value()
		if err != nil {
			panic(fmt.Sprint("iterator value error: ", err))
		}
		fmt.Println("len-value:", len(current.Value()))
	}
}
@mxplusb We have had another crash incident from a different method call:
Caught potentially fatal dispatcher error: runtime error: index out of range [7] with length 4
panic(0x21b9e60, 0xc14db05820)
/usr/local/go/src/runtime/panic.go:969 +0x166
encoding/binary.littleEndian.Uint64(...)
/usr/local/go/src/encoding/binary/binary.go:77
github.com/allegro/bigcache.readHashFromEntry(...)
/usr/local/Code/src/github.com/allegro/bigcache/encoding.go:57
github.com/allegro/bigcache.(*cacheShard).removeOldestEntry(0xc05ceff0e0, 0xc100000002, 0x268, 0x3b3)
/usr/local/Code/src/github.com/allegro/bigcache/shard.go:291 +0x170
github.com/allegro/bigcache.(*cacheShard).set(0xc05ceff0e0, 0xc16b7685a8, 0x8, 0x5beafd6767c17689, 0xc14db53680, 0x24e, 0x24e, 0xffffffffffffffff, 0xc0674e54e0)
/usr/local/Code/src/github.com/allegro/bigcache/shard.go:142 +0x1df
github.com/allegro/bigcache.(*BigCache).Set(0xc000982750, 0xc16b7685a8, 0x8, 0xc14db53680, 0x24e, 0x24e, 0x24e, 0xc14db53680)
/usr/local/Code/src/github.com/allegro/bigcache/bigcache.go:133 +0xac
tamber/distcache/table.(*CacheTable).Set(0xc0002b62a0, 0xc16b7685a8, 0x8, 0xc14db53680, 0x24e, 0x24e, 0x0)
Any ideas?
I think this may be related to #148
This program will panic immediately with error makeslice: len out of range
package main

import (
	"bytes"
	"log"
	"math/rand"
	"runtime"
	"strconv"
	"time"

	"github.com/allegro/bigcache/v2"
)

func main() {
	cache, _ := bigcache.NewBigCache(bigcache.Config{
		Shards:             1,
		LifeWindow:         time.Second,
		MaxEntriesInWindow: 100,
		MaxEntrySize:       256,
		HardMaxCacheSize:   1,
		Verbose:            true,
	})

	go func() {
		defer func() {
			if err := recover(); err != nil {
				buf := make([]byte, 16*1024*1024)
				buf = buf[:runtime.Stack(buf, false)]
				log.Fatalf("err: %v\n\ncrash: %s\n", err, string(buf))
			}
		}()
		for {
			cache.Set(strconv.Itoa(rand.Intn(100)), blob('a', 1024*100))
		}
	}()

	go func() {
		defer func() {
			if err := recover(); err != nil {
				buf := make([]byte, 16*1024*1024)
				buf = buf[:runtime.Stack(buf, false)]
				log.Fatalf("err: %v\n\ncrash: %s\n", err, string(buf))
			}
		}()
		for {
			iter := cache.Iterator()
			for iter.SetNext() {
				value, err := iter.Value()
				log.Printf("key: %v, err: %v", value.Key(), err)
			}
		}
	}()

	select {}
}

func blob(char byte, n int) []byte {
	return bytes.Repeat([]byte{char}, n)
}
2020/06/03 22:49:22 err: runtime error:
crash: goroutine 19 [running]:
main.main.func2.1()
/Users/walli/Projects/awesomeProject/12.go:42 +0x9e
panic(0x10c04e0, 0x10fac30)
/usr/local/Cellar/go/1.14.3/libexec/src/runtime/panic.go:969 +0x166
github.com/allegro/bigcache/v2.readEntry(...)
/Users/walli/go/pkg/mod/github.com/widelee/bigcache/[email protected]/encoding.go:36
github.com/allegro/bigcache/v2.(*EntryInfoIterator).Value(0xc0000b80a0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
/Users/walli/go/pkg/mod/github.com/widelee/bigcache/[email protected]/iterator.go:120 +0x305
main.main.func2(0xc000092d00)
/Users/walli/Projects/awesomeProject/12.go:51 +0xf1
created by main.main
/Users/walli/Projects/awesomeProject/12.go:38 +0x101
In the iterator, SetNext() copies the entry's index, but the object that index points to may change when the bytesQueue reuses memory and pushes new objects at the head of the queue.
Why not just copy the hash values of the keys during iteration and resolve the newest index from the bytesQueue in Value()? For example:
// shard.go
func (s *cacheShard) copyKeys() (keys []uint64, next int) {
	s.lock.RLock()
	keys = make([]uint64, len(s.hashmap))
	for key := range s.hashmap {
		keys[next] = key
		next++
	}
	s.lock.RUnlock()
	return keys, next
}

// iterator.go
// Value returns current value from the iterator
func (it *EntryInfoIterator) Value() (EntryInfo, error) {
	it.mutex.Lock()
	if !it.valid {
		it.mutex.Unlock()
		return emptyEntryInfo, ErrInvalidIteratorState
	}
	wrappedEntry, err := it.cache.shards[it.currentShard].getWrappedEntryWithLock(it.elementKeys[it.currentIndex])
	if err != nil {
		it.mutex.Unlock()
		return emptyEntryInfo, err
	}
	it.mutex.Unlock()
	return EntryInfo{
		timestamp: readTimestampFromEntry(wrappedEntry),
		hash:      readHashFromEntry(wrappedEntry),
		key:       readKeyFromEntry(wrappedEntry),
		value:     readEntry(wrappedEntry),
	}, nil
}
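The stale-index problem can be illustrated with a self-contained sketch. The shard type, length-prefixed buffer, and eviction behavior below are simplified stand-ins, not bigcache's actual implementation; they only model the two behaviors that matter: the buffer wraps and reuses its head when full, and overwritten entries drop out of the hash index.

```go
package main

import "fmt"

// shard is a simplified stand-in for bigcache's cacheShard: a length-prefixed
// byte buffer that wraps around when full, plus a hash -> offset index.
type shard struct {
	buf   []byte
	index map[uint64]int
}

func (s *shard) set(hash uint64, value []byte) {
	// When the buffer is full, reuse it from the start, as bytesQueue does;
	// the overwritten entries drop out of the index.
	if len(s.buf)+len(value)+1 > cap(s.buf) {
		s.buf = s.buf[:0]
		s.index = map[uint64]int{}
	}
	off := len(s.buf)
	s.buf = append(s.buf, byte(len(value)))
	s.buf = append(s.buf, value...)
	s.index[hash] = off
}

// readAt decodes whatever bytes currently live at a raw offset,
// like an iterator holding an index snapshotted by SetNext() would.
func (s *shard) readAt(off int) []byte {
	n := int(s.buf[off])
	return s.buf[off+1 : off+1+n]
}

// get re-resolves through the hash index first, so an evicted entry is a
// clean miss instead of a misread of foreign bytes.
func (s *shard) get(hash uint64) ([]byte, bool) {
	off, ok := s.index[hash]
	if !ok {
		return nil, false
	}
	return s.readAt(off), true
}

func main() {
	s := &shard{buf: make([]byte, 0, 16), index: map[uint64]int{}}

	s.set(1, []byte("first-entry"))
	staleOff := s.index[1] // what a snapshotted iterator index holds

	s.set(2, []byte("overwrite")) // buffer wraps: entry 1's bytes are reused

	fmt.Printf("stale offset reads: %q\n", s.readAt(staleOff)) // entry 2's bytes
	if _, ok := s.get(1); !ok {
		fmt.Println("hash lookup: entry 1 evicted, clean miss")
	}
}
```

The stale offset decodes another entry's bytes, which in real bigcache means readEntry parses a header that doesn't belong to the expected entry, hence the makeslice and index-out-of-range panics; re-resolving by hash turns the same situation into an ordinary miss.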
@mxplusb any hope for a fix here? See a new issue #226 with a similar problem.
I don't actively maintain this project anymore (I was Allegro's first outside collaborator for many years), so I won't personally be providing a fix; I just review community contributions and help make it easier for others to provide a fix if they choose to. Sorry!
Oh cool, we've been happy users for years, but there do seem to be quite a few issues related to this now. If there is any way to get more attention on it, or a better place to get an update from the contributor community, that would certainly be appreciated.
@WideLee would you like to contribute a patch?
@janisz Sure ^_^ I want to change copyKeys() to copy the keys' hashes instead of their indexes in the BytesQueue; is there any problem with that?