kkapper

Results 12 comments of kkapper

I'm assembling an active case of this with debug logs. The startup looks like this:
```
Databend Metasrv
Version: v1.2.632-nightly-125cb429a5-simd(1.81.0-nightly-2024-09-06T04:33:23.592487613Z)
Working DataVersion: V003(2024-05-31: Store snapshot in rotbl)
Raft Feature set:...
```

This is not currently taking place after a restore. A restore was performed some time in the past. The cluster runs for a few days, then eventually, we will end...

Here is a small chunk of log data that may be relevant:
```
2024-09-09T17:09:09.846196Z DEBUG databend_common_meta_raft_store::applier: applier.rs:532 apply: raft-log time: 2024-09-09T13:07:35.229000Z+0000
2024-09-09T17:09:09.846214Z DEBUG databend_common_meta_raft_store::applier: applier.rs:87 apply: entry: T209-N2.11326:normal, log_time_ms: 1725887255229...
```
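To make the two timestamps in that excerpt easier to compare, here is a small sketch that converts the `log_time_ms` value to UTC and subtracts it from the wall-clock time on the log line. The helper names are made up for illustration, and whether the resulting gap matters for this issue is not established here.

```python
from datetime import datetime, timezone

# Values copied from the log excerpt above.
LOG_TIME_MS = 1725887255229
APPLY_WALL_CLOCK = "2024-09-09T17:09:09.846196Z"


def ms_to_utc(ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    seconds, millis = divmod(ms, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=millis * 1000)


def parse_log_timestamp(text: str) -> datetime:
    """Parse the leading timestamp of a log line (UTC, 'Z' suffix)."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


raft_log_time = ms_to_utc(LOG_TIME_MS)
applied_at = parse_log_timestamp(APPLY_WALL_CLOCK)

print(raft_log_time.isoformat())   # 2024-09-09T13:07:35.229000+00:00
print(applied_at - raft_log_time)  # 4:01:34.617196
```

The converted value matches the `raft-log time` printed on the first line, so the entry was written roughly four hours before it was applied in this excerpt.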

```
2024-09-12T17:07:34.438163Z DEBUG databend_common_meta_raft_store::applier: applier.rs:281 txn_execute_one_condition cond=__fd_table_by_id/48488 == seq(48490)
2024-09-12T17:07:34.438194Z DEBUG databend_common_meta_raft_store::applier: applier.rs:290 txn_execute_one_condition: key: __fd_table_by_id/48488 curr: seq:0 value:None
2024-09-12T17:07:34.438218Z DEBUG databend_common_meta_raft_store::applier: applier.rs:348 txn execute TxnOp op=Get(Get key=__fd_table_by_id/48488)
2024-09-12T17:07:34.438248Z DEBUG...
```

```
check if my service is running and run commands
ID: 0
NAME : plaid-databend-meta
initialize leader node
Database state already matches or exceeds bootstrap timestamp...initializing normally...
2024-09-13T13:45:29.370981Z INFO databend_meta::entry:...
```

These are the complete startup logs, up to the point where the error occurs.

I have some suspicion that the issue here might be related to the way the bootstrap process works. This is the initialization script that is included with all databend meta...

If the nodes have failed over, node 0 is no longer the master node, but we are still bootstrapping it as if it were.

This script might need to be changed to work out which node is the master and initialize it using variables instead of relying on the node indices.
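As a rough sketch of that idea (not the actual initialization script, and not a Databend API): the decision could be driven by what the other nodes report rather than by the node index alone. Everything below is hypothetical, including the function names, the action labels, and the assumption that each node can somehow be asked which leader it currently sees.

```python
from typing import Dict, Optional


def find_current_leader(reported: Dict[int, Optional[int]]) -> Optional[int]:
    """Return the leader id that a majority of nodes agree on, else None.

    `reported` maps node index -> the leader id that node reports, or None
    if the node is unreachable or knows no leader. How this is collected
    from a real cluster is outside this sketch.
    """
    votes: Dict[int, int] = {}
    for leader in reported.values():
        if leader is not None:
            votes[leader] = votes.get(leader, 0) + 1
    quorum = len(reported) // 2 + 1
    for leader, count in votes.items():
        if count >= quorum:
            return leader
    return None


def startup_action(node_index: int,
                   reported: Dict[int, Optional[int]],
                   has_local_state: bool) -> str:
    """Pick a (hypothetical) startup action for one node."""
    if find_current_leader(reported) is not None:
        # A leader already exists somewhere: never bootstrap, even on node 0.
        return "join_existing"
    if has_local_state:
        # No leader visible, but this node has data: start normally and
        # let the cluster elect a leader.
        return "start"
    if node_index == 0:
        # Fresh cluster with no state: only here is node 0 bootstrapped.
        return "bootstrap"
    return "wait"


# Node 0 restarts after a failover in which node 2 became leader.
print(startup_action(0, {0: None, 1: 2, 2: 2}, has_local_state=True))
# prints "join_existing"
```

The point of the design is that node 0 is only bootstrapped when no leader is visible and it has no local state. In every other case the index plays no role.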