roachtest: backup-restore/online-restore failed [Failed to receive a commit error on FK when one was expected]
roachtest.backup-restore/online-restore failed with artifacts on master @ 41084720464c4144f64d9ddcb46508b4d762c4e8:
(monitor.go:154).Wait: monitor failure: full command output in run_055712.215775638_n4_COCKROACHRANDOMSEED3.log: COMMAND_PROBLEM: exit status 1
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- ROACHTEST_arch=amd64
- ROACHTEST_cloud=gce
- ROACHTEST_coverageBuild=false
- ROACHTEST_cpu=4
- ROACHTEST_encrypted=false
- ROACHTEST_fs=ext4
- ROACHTEST_localSSD=true
- ROACHTEST_runtimeAssertionsBuild=false
- ROACHTEST_ssd=0
Same failure on other branches
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
- #129187 roachtest: backup-restore/online-restore failed [relation "[239]" does not exist during schemachange workload] [C-test-failure O-roachtest O-robot P-2 T-sql-foundations branch-release-24.2]
Jira issue: CRDB-42499
The schema change workload failed, though it is unclear why. Sending to Foundations.
This was expecting a FK violation, but did not get one.
{
"workerId": 2,
"clientTimestamp": "05:57:33.584131",
"ops": [
"BEGIN",
{
"sql": "INSERT INTO public.table_w1_33 (col33_w1_34, col33_w1_35, col33_w1_36, \"Col 33_w1_37\", \"cOl33_w1_38\", col33_w1_39, \"!c\fol3'3_w1_40\", col33_w1_41, col33_w1_42, \"\"\"c😡o%pl33_w1_43\", \"c%vol33\"\"_w1_44\", \"c%pol33\\\\U000D8996_w1_45\", \"col33\"\" _w1_46\", col33_w1_47, col_w1_33_w3_36, col_w1_33_w1_103) VALUES (ARRAY[e't\\\\\\x7f\u003e6\\x11\\x07x\\x1b':::STRING,e'\\n\"':::STRING],false,(-42):::INT8,'\"'::BPCHAR,'D'::BPCHAR,''::STRING,']'::BPCHAR,true,113:::INT8,'\\x06':::BYTES,3614:::OID,90000:::OID,e'\\U0000B56B\\U00022068\\U000E90CB\\U000BE910\\U0000B870\\U0001E065\\U000A7597\\U00057986\\U0007BA9F' COLLATE de_DE,'2009-01-14 13:59:04.000245':::TIMESTAMP,4096:::OID,'13:31:20.343023+09:12':::TIMETZ),(ARRAY[e'\\x15\u003c\\b\\x17\\'S}i':::STRING,e'U\\x17.oX4':::STRING,'':::STRING,e'Q\\x01q\\bt':::STRING,'S!1':::STRING,e'z}\\x7ft\\x0f$\\x1e':::STRING,e's\\x057V9\\x04Q\\n':::STRING,e'\\x1f\u00268me#':::STRING],false,71:::INT8,NULL,'\"'::BPCHAR,''::STRING,'$'::BPCHAR,false,124:::INT8,'\\xe50b502a178bcd5b':::BYTES,1560:::OID,1560:::OID,e'\\u0476\\U000727EF\\U0005B4E3' COLLATE de_DE,'2025-10-18 10:52:35.000972':::TIMESTAMP,25:::OID,'09:26:40.916134+01:15':::TIMETZ)",
"potentialExecErr": "23502,23503,23505,23514"
},
"COMMIT"
],
"expectedExecErrors": "",
"expectedCommitErrors": "23503",
"message": "***FAIL; Failed to receive a commit error when at least one commit error was expected",
"errorState": {
"expectedCommitErrors": [
"23503"
],
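The failure message in the log above amounts to a simple comparison: the workload recorded `23503` (foreign-key violation) in `expectedCommitErrors`, yet the `COMMIT` returned no error. A minimal sketch of that check, written in Python for brevity. The function name and the returned strings are hypothetical and are not taken from the schemachange workload's code:

```python
from typing import Optional, Set


def check_commit_outcome(expected_commit_errors: Set[str],
                         actual_code: Optional[str]) -> str:
    """Classify a COMMIT result against the SQLSTATE codes that were expected.

    actual_code is None when the COMMIT succeeded.
    """
    if actual_code is None:
        if expected_commit_errors:
            # The situation reported in this issue.
            return "FAIL: expected a commit error, got none"
        return "OK"
    if actual_code in expected_commit_errors:
        return "OK: expected commit error"
    return "FAIL: unexpected commit error " + actual_code


# The case from the log: 23503 was expected, but COMMIT succeeded.
print(check_commit_outcome({"23503"}, None))
```

Under these assumptions, the logged transaction falls into the first failing branch: the expected set is non-empty and no error came back.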
@fqazi, would https://github.com/cockroachdb/cockroach/pull/132168 be related to this?
roachtest.backup-restore/online-restore failed with artifacts on master @ 42f40f59cae3c0fd8842e194d6991c951ab4382f:
(monitor.go:149).Wait: monitor failure: backup 3_round-trip-test-backup_cluster: mismatched fingerprints for table tpcc.customer
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- ROACHTEST_arch=amd64
- ROACHTEST_cloud=gce
- ROACHTEST_coverageBuild=false
- ROACHTEST_cpu=4
- ROACHTEST_encrypted=false
- ROACHTEST_fs=ext4
- ROACHTEST_localSSD=true
- ROACHTEST_runtimeAssertionsBuild=false
- ROACHTEST_ssd=0
Same failure on other branches
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.2 release-blocker]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
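A "mismatched fingerprints" failure means the per-index fingerprints of a restored table differ from those of the original. A rough sketch of such a comparison in Python, assuming each side is available as a mapping from index name to fingerprint (the `SHOW EXPERIMENTAL_FINGERPRINTS` query quoted later in this thread selects `index_name, fingerprint`). The helper name and the sample values are made up for illustration and do not come from the roachtest:

```python
from typing import Dict, List


def mismatched_indexes(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return the sorted index names whose fingerprint differs,
    or that exist on only one side."""
    names = set(before) | set(after)
    return sorted(n for n in names if before.get(n) != after.get(n))


# Hypothetical fingerprints for two indexes of one table.
backup_side = {"customer_pkey": "111", "customer_idx": "222"}
restored_side = {"customer_pkey": "999", "customer_idx": "222"}
print(mismatched_indexes(backup_side, restored_side))  # ['customer_pkey']
```

Reporting the index name rather than only the table narrows down which index to inspect first.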
roachtest.backup-restore/online-restore failed with artifacts on master @ 833dadd212fa4b12b1442ae8e00e85ee80a8cdce:
(monitor.go:149).Wait: monitor failure: backup 1_round-trip-test-backup_cluster: mismatched fingerprints for table tpcc.customer
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- ROACHTEST_arch=amd64
- ROACHTEST_cloud=gce
- ROACHTEST_coverageBuild=false
- ROACHTEST_cpu=4
- ROACHTEST_encrypted=true
- ROACHTEST_fs=ext4
- ROACHTEST_localSSD=true
- ROACHTEST_runtimeAssertionsBuild=false
- ROACHTEST_ssd=0
Same failure on other branches
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.2 release-blocker]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
roachtest.backup-restore/online-restore failed with artifacts on master @ 472ea07a5232c98536293d13bb46cca59f9f2cd0:
(monitor.go:149).Wait: monitor failure: backup 1_round-trip-test-backup_cluster: mismatched fingerprints for table tpcc.customer
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- ROACHTEST_arch=amd64
- ROACHTEST_cloud=gce
- ROACHTEST_coverageBuild=false
- ROACHTEST_cpu=4
- ROACHTEST_encrypted=true
- ROACHTEST_fs=ext4
- ROACHTEST_localSSD=true
- ROACHTEST_runtimeAssertionsBuild=false
- ROACHTEST_ssd=0
Same failure on other branches
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.2 release-blocker]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
roachtest.backup-restore/online-restore failed with artifacts on master @ 472ea07a5232c98536293d13bb46cca59f9f2cd0:
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castTimestampStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:11894
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec.(*defaultBuiltinFuncOperator).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/builtin_funcs.go:41
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*simpleProjectOp).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase/simple_project.go:119
| github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).next
| github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:113
| github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError
| github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:147
| github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Next
| github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:121
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*fnOp).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase/fn_op.go:27
| github.com/cockroachdb/cockroach/pkg/sql/colexec.(*orderedAggregator).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/ordered_aggregator.go:204
Wraps: (2) non-nullable column "customer:c_balance" with no value! Index scanned was "customer_pkey" with the index key columns (c_w_id,c_d_id,c_id) and the values (4,8,2915)
Error types: (1) *withstack.withStack (2) *errutil.leafError
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- ROACHTEST_arch=amd64
- ROACHTEST_cloud=gce
- ROACHTEST_coverageBuild=false
- ROACHTEST_cpu=4
- ROACHTEST_encrypted=false
- ROACHTEST_fs=ext4
- ROACHTEST_localSSD=true
- ROACHTEST_runtimeAssertionsBuild=false
- ROACHTEST_ssd=0
Same failure on other branches
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.2 release-blocker]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
roachtest.backup-restore/online-restore failed with artifacts on master @ 472ea07a5232c98536293d13bb46cca59f9f2cd0:
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castTimestampStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:11894
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
| github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
| github.com/cockroachdb/cockroach/pkg/sql/colexec.(*defaultBuiltinFuncOperator).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/builtin_funcs.go:41
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*simpleProjectOp).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase/simple_project.go:119
| github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).next
| github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:113
| github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError
| github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:147
| github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Next
| github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:121
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*fnOp).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase/fn_op.go:27
| github.com/cockroachdb/cockroach/pkg/sql/colexec.(*orderedAggregator).Next
| github.com/cockroachdb/cockroach/pkg/sql/colexec/ordered_aggregator.go:204
Wraps: (2) non-nullable column "customer:c_balance" with no value! Index scanned was "customer_pkey" with the index key columns (c_w_id,c_d_id,c_id) and the values (0,1,33)
Error types: (1) *withstack.withStack (2) *errutil.leafError
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- ROACHTEST_arch=amd64
- ROACHTEST_cloud=gce
- ROACHTEST_coverageBuild=false
- ROACHTEST_cpu=4
- ROACHTEST_encrypted=true
- ROACHTEST_fs=ext4
- ROACHTEST_localSSD=true
- ROACHTEST_runtimeAssertionsBuild=false
- ROACHTEST_ssd=0
Same failure on other branches
- #133018 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3 release-blocker]
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.2 release-blocker]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
Note: This build has runtime assertions enabled. If the same failure was hit in a run without assertions enabled, there should be a similar failure without this message. If there isn't one, then this failure is likely due to an assertion violation or (assertion) timeout.
roachtest.backup-restore/online-restore failed with artifacts on master @ eb2d2e19eb29d2747d9e267bd0612a69d066adad:
(monitor.go:149).Wait: monitor failure: backup 1_round-trip-test-backup_cluster: error verifying online restore: backup 1_round-trip-test-backup_cluster: error loading online restored contents: error querying column names for system.settings: dial tcp 34.74.78.82:26257: connect: connection refused
unexpected node event: n3: cockroach process for system interface died (exit code 134)
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- arch=amd64
- cloud=gce
- coverageBuild=false
- cpu=4
- encrypted=false
- fs=ext4
- localSSD=true
- runtimeAssertionsBuild=true
- ssd=0
Same failure on other branches
- #135790 roachtest: backup-restore/online-restore failed [A-disaster-recovery B-runtime-assertions-enabled C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3.0-rc release-blocker]
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
The most recent failure is a block checksum error. Since we haven't seen the old failure in a month, I'm retitling this issue to be about the checksum error and moving it back to DR and/or Storage. I'm calling the Foundations/schema change flake closed unless we see it again.
Note: This build has runtime assertions enabled. If the same failure was hit in a run without assertions enabled, there should be a similar failure without this message. If there isn't one, then this failure is likely due to an assertion violation or (assertion) timeout.
roachtest.backup-restore/online-restore failed with artifacts on master @ 5c5c9d6803d47848aa1960dd6642d5f2c1926814:
(monitor.go:149).Wait: monitor failure: backup 2_round-trip-test-backup_database-tpcc: error verifying online restore: backup 2_round-trip-test-backup_database-tpcc: error loading online restored contents: error when running query [SELECT index_name, fingerprint FROM [SHOW EXPERIMENTAL_FINGERPRINTS FROM TABLE restore_2_round_trip_test_backup_database_tpcc_2.new_order] ORDER BY index_name]: pq: internal error while retrieving user account memberships: operation "get-user-session" timed out after 10.001s (given timeout 10s): internal error while retrieving user account: get default settings error: interrupted during singleflight load-value:defaultsettings-roachprod-100-1: context deadline exceeded
unexpected node event: n3: cockroach process for system interface died (exit code 134)
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- arch=amd64
- cloud=gce
- coverageBuild=false
- cpu=4
- encrypted=true
- fs=ext4
- localSSD=true
- runtimeAssertionsBuild=true
- ssd=0
Same failure on other branches
- #135790 roachtest: backup-restore/online-restore failed [A-disaster-recovery B-runtime-assertions-enabled C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3.0-rc release-blocker]
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
Note: This build has runtime assertions enabled. If the same failure was hit in a run without assertions enabled, there should be a similar failure without this message. If there isn't one, then this failure is likely due to an assertion violation or (assertion) timeout.
roachtest.backup-restore/online-restore failed with artifacts on master @ cea3ff5562160a3bf2802da052da2aaa40e1ccc1:
(monitor.go:149).Wait: monitor failure: backup 2_round-trip-test-backup_database-tpcc: error verifying online restore: backup 2_round-trip-test-backup_database-tpcc: error loading online restored contents: error when running query [SELECT index_name, fingerprint FROM [SHOW EXPERIMENTAL_FINGERPRINTS FROM TABLE restore_2_round_trip_test_backup_database_tpcc_2.item] ORDER BY index_name]: read tcp 172.17.0.3:45490 -> 35.237.252.62:26257: i/o timeout
unexpected node event: n3: cockroach process for system interface died (exit code 134)
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- arch=amd64
- cloud=gce
- coverageBuild=false
- cpu=4
- encrypted=true
- fs=ext4
- localSSD=true
- runtimeAssertionsBuild=true
- ssd=0
Same failure on other branches
- #135790 roachtest: backup-restore/online-restore failed [A-disaster-recovery B-runtime-assertions-enabled C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3.0-rc release-blocker]
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
roachtest.backup-restore/online-restore failed with artifacts on master @ f717f6bd218121bb5e3376af658545f6bff30c22:
(monitor.go:149).Wait: monitor failure: backup 1_round-trip-test-backup_database-tpcc: error verifying online restore: backup 1_round-trip-test-backup_database-tpcc: error loading online restored contents: error when running query [SELECT index_name, fingerprint FROM [SHOW EXPERIMENTAL_FINGERPRINTS FROM TABLE restore_1_round_trip_test_backup_database_tpcc_1.warehouse] ORDER BY index_name]: pq: internal error while retrieving user account memberships: operation "get-user-session" timed out after 10.001s (given timeout 10s): internal error while retrieving user account: get auth info error: interrupted during singleflight load-value:authinfo-roachprod-8-8: context deadline exceeded
unexpected node event: n2: cockroach process for system interface died (exit code 134)
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- arch=amd64
- cloud=gce
- coverageBuild=false
- cpu=4
- encrypted=false
- fs=ext4
- localSSD=true
- runtimeAssertionsBuild=false
- ssd=0
Same failure on other branches
- #135790 roachtest: backup-restore/online-restore failed [A-disaster-recovery B-runtime-assertions-enabled C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3.0-rc release-blocker]
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
roachtest.backup-restore/online-restore failed with artifacts on master @ f717f6bd218121bb5e3376af658545f6bff30c22:
(monitor.go:149).Wait: monitor failure: backup 2_round-trip-test-backup_cluster: error verifying online restore: backup 2_round-trip-test-backup_cluster: error loading online restored contents: error running command (./cockroach sql --certs-dir certs -e "SELECT * FROM [SHOW USERS]" --port {pgport:2}): COMMAND_PROBLEM: exit status 1
unexpected node event: n2: cockroach process for system interface died (exit code 134)
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- arch=amd64
- cloud=gce
- coverageBuild=false
- cpu=4
- encrypted=true
- fs=ext4
- localSSD=true
- runtimeAssertionsBuild=false
- ssd=0
Same failure on other branches
- #135790 roachtest: backup-restore/online-restore failed [A-disaster-recovery B-runtime-assertions-enabled C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3.0-rc release-blocker]
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
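The runtime-assertions note attached to some of the reports above suggests a simple triage rule: a failure that also shows up in a build without assertions is probably not caused by the assertions themselves. A small sketch of that rule in Python, fed with the node-death lines and `runtimeAssertionsBuild` values of the five reports in this thread that show `exit code 134`. The function name and the tuple layout are illustrative assumptions, not part of roachtest:

```python
from typing import Iterable, Tuple


def seen_only_with_assertions(runs: Iterable[Tuple[str, bool]], signature: str) -> bool:
    """True if at least one run matches `signature` and every matching run
    had runtime assertions enabled."""
    matching = [assertions for message, assertions in runs if signature in message]
    return bool(matching) and all(matching)


# (node event, runtimeAssertionsBuild) for the five reports above.
runs = [
    ("n3: cockroach process for system interface died (exit code 134)", True),
    ("n3: cockroach process for system interface died (exit code 134)", True),
    ("n3: cockroach process for system interface died (exit code 134)", True),
    ("n2: cockroach process for system interface died (exit code 134)", False),
    ("n2: cockroach process for system interface died (exit code 134)", False),
]
print(seen_only_with_assertions(runs, "exit code 134"))  # False
```

Because two of the crashes happened with `runtimeAssertionsBuild=false`, this rule points away from an assertion-only explanation for the node deaths.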
Can confirm this bug started occurring when columnar blocks were enabled for the backup sst sink in https://github.com/cockroachdb/cockroach/commit/fa505ac51abf87bdc607cf4d6387ce5359b93fa9.
This will be fixed once https://github.com/cockroachdb/pebble/pull/4181 is merged into Pebble and the Pebble ref is bumped in Cockroach.
Reopening until the pebble bump is in.
@itsbilal thanks for finding and fixing this! Do you plan to backport this to 24.3 (and 24.2)?
@msbutler we could backport it to 24.3, but it would be near-impossible to hit there because it's hard to turn on columnar blocks on that branch. This bug does not exist on 24.2 (no columnar blocks).
Let's backport it to 24.3.
Note: This build has runtime assertions enabled. If the same failure was hit in a run without assertions enabled, there should be a similar failure without this message. If there isn't one, then this failure is likely due to an assertion violation or (assertion) timeout.
roachtest.backup-restore/online-restore failed with artifacts on master @ 67caf19d3998bb3ca1ada7e3c14486d505b68012:
(monitor.go:149).Wait: monitor failure: full command output in run_072443.565297314_n4_COCKROACHRANDOMSEED4.log: COMMAND_PROBLEM: exit status 1
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1
Parameters:
- arch=amd64
- cloud=gce
- coverageBuild=false
- cpu=4
- encrypted=true
- fs=ext4
- localSSD=true
- runtimeAssertionsBuild=true
- ssd=0
Same failure on other branches
- #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
- #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]