
Panic observed during applyCh

Open: iluminae opened this issue 4 years ago • 1 comment

GitHub issues are back! Putting this one in now:

Following up on the topic here: https://discuss.dgraph.io/t/critical-bug-in-v21-12-permanently-crashloops-whole-groups/16383/7

The following panic is observed repeatedly:

2021/12/22 06:56:16 Unable to find txn with start ts: 2178082
github.com/dgraph-io/dgraph/x.AssertTruef
        /ext-go/1/src/github.com/dgraph-io/dgraph/x/error.go:107
github.com/dgraph-io/dgraph/worker.(*node).applyMutations
        /ext-go/1/src/github.com/dgraph-io/dgraph/worker/draft.go:707
github.com/dgraph-io/dgraph/worker.(*node).applyCommitted
        /ext-go/1/src/github.com/dgraph-io/dgraph/worker/draft.go:744
github.com/dgraph-io/dgraph/worker.(*node).processApplyCh.func1
        /ext-go/1/src/github.com/dgraph-io/dgraph/worker/draft.go:931
github.com/dgraph-io/dgraph/worker.(*node).processApplyCh.func2
        /ext-go/1/src/github.com/dgraph-io/dgraph/worker/draft.go:970
github.com/dgraph-io/dgraph/worker.(*node).processApplyCh
        /ext-go/1/src/github.com/dgraph-io/dgraph/worker/draft.go:1025
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1581

Once this panic occurs, the node restarts, replays its WAL, and hits the same panic again, so the node is permanently disabled. This tends to happen to entire groups at once.

v21.12:

$ dgraph version

Dgraph version   : v21.12.0
Dgraph codename  : zion
Dgraph SHA-256   : 078c75df9fa1057447c8c8afc10ea57cb0a29dfb22f9e61d8c334882b4b4eb37
Commit SHA-1     : d62ed5f15
Commit timestamp : 2021-12-02 21:20:09 +0530
Branch           : HEAD
Go version       : go1.17.3
jemalloc enabled : true

iluminae, Jan 05 '22 01:01

I've also encountered this intermittently during some high-throughput tests: same version, same stack trace. I can't reliably reproduce it, though.

Environment: Docker on macOS, image id sha256:e522ce9e32cb2877485f6ab2d7d125fae50907d13c8ceddce5705d52494b0523

Workload:

  • 1x goroutine running upserts in batches of 200 nodes and 100-400 edges
  • 1x goroutine querying all data every second

Library: github.com/dgraph-io/dgo/v200 v200.0.0-20210401091508-95bfd74de60e

dylanratcliffe, May 16 '22 09:05