tidb-lightning

Checksum step fails intermittently

Open july2993 opened this issue 5 years ago • 0 comments

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error. Restored the table order_line of a 10k-warehouse tpcc data set.
  2. What did you expect to see? The checksum succeeds without a retry.
  3. What did you see instead? The checksum failed on the first attempt (which took about 10 minutes), then was retried and succeeded:
[2020/03/10 20:44:47.162 +08:00] [INFO] [tidb.go:249] ["alter table auto_increment completed"] [table=`tpcc`.`order_line`] [auto_increment=32650894623] [takeTime=82.342128ms] []
[2020/03/10 20:44:47.162 +08:00] [INFO] [restore.go:1024] ["local checksum"] [table=`tpcc`.`order_line`] [checksum="{cksum=3588732482266855866,size=688778185266,kvs=8601932770}"]
[2020/03/10 20:44:47.169 +08:00] [INFO] [restore.go:1482] ["remote checksum start"] [table=`tpcc`.`order_line`]
[2020/03/10 20:45:52.311 +08:00] [INFO] [restore.go:492] [progress] [files="40/40 (100.0%)"] [tables="0/10 (0.0%)"] [speed(MiB/s)=75.81300532977347] [state=post-processing] []
[2020/03/10 20:50:52.311 +08:00] [INFO] [restore.go:492] [progress] [files="40/40 (100.0%)"] [tables="0/10 (0.0%)"] [speed(MiB/s)=70.75872548328216] [state=post-processing] []
[2020/03/10 20:55:52.311 +08:00] [INFO] [restore.go:492] [progress] [files="40/40 (100.0%)"] [tables="0/10 (0.0%)"] [speed(MiB/s)=66.33624475816293] [state=post-processing] []
[2020/03/10 20:56:23.325 +08:00] [WARN] [util.go:115] ["compute remote checksum failed but going to try again"] [table=`tpcc`.`order_line`] [query="ADMIN CHECKSUM TABLE `tpcc`.`order_line`"] [retryCnt=0] [error="Error 9005: Region is unavailable"]
[2020/03/10 20:56:23.325 +08:00] [WARN] [util.go:106] ["compute remote checksum retry start"] [table=`tpcc`.`order_line`] [query="ADMIN CHECKSUM TABLE `tpcc`.`order_line`"] [retryCnt=1]
[2020/03/10 20:58:07.796 +08:00] [INFO] [restore.go:1496] ["remote checksum completed"] [table=`tpcc`.`order_line`] [takeTime=13m20.626387114s] []
[2020/03/10 20:58:07.802 +08:00] [INFO] [restore.go:1445] ["checksum pass"] [table=`tpcc`.`order_line`] [local="{cksum=3588732482266855866,size=688778185266,kvs=8601932770}"]
[2020/03/10 20:58:07.802 +08:00] [INFO] [restore.go:1450] ["analyze start"] [table=`tpcc`.`order_line`]
[2020/03/10 21:00:52.311 +08:00] [INFO] [restore.go:492] [progress] [files="40/40 (100.0%)"] [tables="0/10 (0.0%)"] [speed(MiB/s)=62.434052438590804] [state=post-processing] []
  4. Versions of the cluster

    • TiDB-Lightning version (run tidb-lightning -V):

      (paste TiDB-Lightning version here)

    • TiKV-Importer version (run tikv-importer -V)

      4.0.beta

    • TiKV version (run tikv-server -V):

      4.0.beta

    • TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

      4.0.beta

    • Other interesting information (system version, hardware config, etc):

  5. Operation logs

    • Please upload tidb-lightning.log for TiDB-Lightning if possible
    • Please upload tikv-importer.log from TiKV-Importer if possible
    • Other interesting logs
  6. Configuration of the cluster and the task

    • tidb-lightning.toml for TiDB-Lightning if possible
    • tikv-importer.toml for TiKV-Importer if possible
    • inventory.ini if deployed by Ansible
  7. Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for TiDB-Lightning if possible
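
For reference, the timestamps in the log under item 3 show how the reported takeTime of 13m20s splits between the failed first attempt and the retry. The snippet below is only a small sketch of that arithmetic: the three timestamps are copied from the "remote checksum start", "compute remote checksum failed but going to try again", and "remote checksum completed" lines, and the helper name parse_ts is made up for illustration (it is not part of tidb-lightning).

```python
from datetime import datetime

# Timestamp layout used in the log lines above, e.g. "2020/03/10 20:44:47.169 +08:00".
LOG_TS_FORMAT = "%Y/%m/%d %H:%M:%S.%f %z"


def parse_ts(text):
    """Parse one log timestamp (hypothetical helper, for illustration only)."""
    return datetime.strptime(text, LOG_TS_FORMAT)


# Values copied verbatim from the log in item 3.
checksum_start = parse_ts("2020/03/10 20:44:47.169 +08:00")  # "remote checksum start"
first_failure = parse_ts("2020/03/10 20:56:23.325 +08:00")   # "Error 9005: Region is unavailable"
checksum_done = parse_ts("2020/03/10 20:58:07.796 +08:00")   # "remote checksum completed"

first_attempt = first_failure - checksum_start  # time spent before the failure
retry_attempt = checksum_done - first_failure   # time spent on the successful retry
total = checksum_done - checksum_start          # should agree with the logged takeTime

print("first attempt:", first_attempt)
print("retry:", retry_attempt)
print("total:", total)
```

By this calculation the failed first attempt accounts for roughly 11m36s of the total, while the retry that succeeded took roughly 1m44s.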

july2993, Mar 16 '20 14:03