[Bug]: dn restarts frequently when running sql 'update table set column'.
Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
Branch Name
main
Commit ID
785be884c9f193a1fe17c94e30dae782f96f6776
Other Environment Information
- Hardware parameters:
- OS type:
- Others:
Actual Behavior
job url(Quries:1y): https://github.com/matrixorigin/mo-nightly-regression/actions/runs/7737767923/job/21101946008
The problem occurred when the run reached this point:
Pod status:
log: http://175.178.192.213:30088/explore?panes=%7B%22AAL%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22branch-big-data-nightly-785be88%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%22now-6h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1
Expected Behavior
No response
Steps to Reproduce
trigger big data test workflow.
Additional information
No response
No progress yet.
Big data test on 2024-03-07:
{"level":"INFO","time":"2024/03/07 01:47:55.283797 +0000","caller":"logtail/utils.go:2601","msg":"read-all","operation":"read","operand":"checkpoint","size":4355263,"duration":"52.97915ms"}
{"level":"INFO","time":"2024/03/07 01:47:55.283833 +0000","caller":"checkpoint/replay.go:218","msg":"replay checkpoint CKP[G][2]ckp 77325, truncate 72325"}
gc 9 @1.650s 0%: 0.063+12+0.018 ms clock, 1.0+0.096/50/132+0.29 ms cpu, 243->246->184 MB, 286 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 10 @1.926s 1%: 0.070+95+0.015 ms clock, 1.1+0.34/374/938+0.24 ms cpu, 332->353->341 MB, 369 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 11 @2.563s 1%: 0.066+71+0.016 ms clock, 1.0+0.52/287/739+0.26 ms cpu, 611->626->604 MB, 683 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 12 @3.733s 2%: 0.071+176+0.015 ms clock, 1.1+0.14/704/1933+0.25 ms cpu, 1077->1105->1066 MB, 1210 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 13 @5.700s 2%: 0.069+293+0.016 ms clock, 1.1+0.33/1169/3124+0.26 ms cpu, 1899->1954->1890 MB, 2134 MB goal, 0 MB stacks, 0 MB globals, 16 P
{"level":"INFO","time":"2024/03/07 01:48:01.910046 +0000","caller":"checkpoint/replay.go:256","msg":"replay checkpoint CKP[I][2]ckp 77787, truncate 72787"}
{"level":"INFO","time":"2024/03/07 01:48:01.916325 +0000","caller":"checkpoint/replay.go:256","msg":"replay checkpoint CKP[I][2]ckp 78253, truncate 73253"}
{"level":"INFO","time":"2024/03/07 01:48:01.922584 +0000","caller":"checkpoint/replay.go:256","msg":"replay checkpoint CKP[I][2]ckp 78688, truncate 73688"}
{"level":"INFO","time":"2024/03/07 01:48:01.928592 +0000","caller":"checkpoint/replay.go:256","msg":"replay checkpoint CKP[I][2]ckp 79118, truncate 74118"}
{"level":"INFO","time":"2024/03/07 01:48:01.934478 +0000","caller":"checkpoint/replay.go:256","msg":"replay checkpoint CKP[I][2]ckp 79644, truncate 74644"}
{"level":"INFO","time":"2024/03/07 01:48:02.244139 +0000","caller":"checkpoint/replay.go:256","msg":"replay checkpoint CKP[I][2]ckp 80176, truncate 75176"}
{"level":"INFO","time":"2024/03/07 01:48:02.257638 +0000","caller":"checkpoint/replay.go:287","msg":"open-tae","operation":"replay","operand":"checkpoint","apply cost":"6.973820003s","read cost":"212.562626ms","total count":13,"read count":7,"apply count":7}
{"level":"INFO","time":"2024/03/07 01:48:02.257718 +0000","caller":"db/open.go:150","msg":"open-tae","operation":"replay","operand":"checkpoints","cost":"7.186504007s","checkpointed":"1709753681761467295-0"}
{"level":"INFO","time":"2024/03/07 01:48:02.258982 +0000","caller":"logservicedriver/truncate.go:116","msg":"Logservice Driver: Get Truncate 75633"}
{"level":"INFO","time":"2024/03/07 01:48:02.259032 +0000","caller":"logservicedriver/replay.go:62","msg":"truncated 75633"}
gc 14 @9.193s 2%: 0.075+434+0.015 ms clock, 1.2+0.25/1733/4751+0.25 ms cpu, 3365->3454->3085 MB, 3781 MB goal, 0 MB stacks, 0 MB globals, 16 P
{"level":"INFO","time":"2024/03/07 01:48:05.029005 +0000","caller":"blockio/pipeline.go:495","msg":"SelectivityStats: BLK[0/0=0.0000] COL[0/0=0.0000] RDF[0/0=0.0000,0/0=0.0000]RDD[0s/0s/0s/0]"}
{"level":"INFO","time":"2024/03/07 01:48:05.029079 +0000","caller":"blockio/pipeline.go:501","msg":"MetaCacheWindow: 208/217 | 208/230, MetaCacheTotal: 208/217 | 208/230"}
gc 15 @13.663s 2%: 0.077+510+0.030 ms clock, 1.2+0.28/2042/5554+0.48 ms cpu, 5519->5613->3344 MB, 6171 MB goal, 0 MB stacks, 0 MB globals, 16 P
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1f90121]
goroutine 669 [running]:
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/catalog.(*Catalog).replayObjectByBlock(0x0?, 0xc007d0bd00, {0x1, 0x8e, 0x15, 0x47, 0x27, 0x33, 0x78, 0x61, ...}, ...)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/catalog/catalog.go:689 +0x1c1
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/catalog.(*Catalog).onReplayUpdateBlock(0x0?, 0xc01c979b30, {0x45263e8, 0xc002b5a738}, {0x450f7a0, 0xc000808280})
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/catalog/catalog.go:723 +0x32c
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/catalog.(*Catalog).ReplayCmd(0x0?, {0x457e888, 0xc01c979b30}, {0x45263e8, 0xc002b5a738}, {0x450f7a0, 0xc000808280})
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/catalog/catalog.go:215 +0x179
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnimpl.(*replayTxnStore).prepareCmd(0xc113ab0320, {0x457e888, 0xc01c979b30})
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnimpl/replaystore.go:114 +0x2d5
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnimpl.(*replayTxnStore).prepareCommit(0xc113ab0320, {0x45d5620, 0xc0aab17f80})
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnimpl/replaystore.go:82 +0xb8
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnbase.(*Txn).PrepareCommit(0xc0aab17f80)
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnbase/txn.go:324 +0x79
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnbase.(*TxnManager).onPreparCommit(0x1f671a0?, {0x45d5620, 0xc0aab17f80})
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnbase/txnmgr.go:338 +0x24
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnbase.(*TxnManager).onPrepare(0x0?, 0xc05b30b500, {0x0, 0x0, 0x0, 0x0, 0x94, 0xf9, 0xb0, 0xd4, ...})
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnbase/txnmgr.go:392 +0x2c
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnbase.(*TxnManager).onPrepare1PC(0xc002b122a0?, 0xc05b30b500?, {0x0, 0x0, 0x0, 0x0, 0x94, 0xf9, 0xb0, 0xd4, ...})
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnbase/txnmgr.go:417 +0x45
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnbase.(*TxnManager).dequeuePreparing(0xc002b122a0, {0xc006970000, 0x1, 0x3e8})
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/txn/txnbase/txnmgr.go:511 +0x2b5
github.com/matrixorigin/matrixone/pkg/vm/engine/tae/logstore/sm.(*safeQueue).Start.func1()
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/logstore/sm/safeq.go:89 +0x1e5
created by github.com/matrixorigin/matrixone/pkg/vm/engine/tae/logstore/sm.(*safeQueue).Start in goroutine 358
    /go/src/github.com/matrixorigin/matrixone/pkg/vm/engine/tae/logstore/sm/safeq.go:68 +0xe5
log:http://175.178.192.213:30088/explore?panes=%7B%22xdF%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22branch-big-data-nightly-54ffba1%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%22now-12h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1
Verifying.
fixed by #14855
fixed