
roachtest: backup-restore/online-restore failed [Failed to receive a commit error on FK when one was expected]

Open cockroach-teamcity opened this issue 1 year ago • 6 comments

roachtest.backup-restore/online-restore failed with artifacts on master @ 41084720464c4144f64d9ddcb46508b4d762c4e8:

(monitor.go:154).Wait: monitor failure: full command output in run_055712.215775638_n4_COCKROACHRANDOMSEED3.log: COMMAND_PROBLEM: exit status 1
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=4
  • ROACHTEST_encrypted=false
  • ROACHTEST_fs=ext4
  • ROACHTEST_localSSD=true
  • ROACHTEST_runtimeAssertionsBuild=false
  • ROACHTEST_ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

See: Grafana

Same failure on other branches

  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]
  • #129187 roachtest: backup-restore/online-restore failed [relation "[239]" does not exist during schemachange workload] [C-test-failure O-roachtest O-robot P-2 T-sql-foundations branch-release-24.2]
/cc @cockroachdb/disaster-recovery

This test on roachdash | Improve this report!

Jira issue: CRDB-42499

cockroach-teamcity avatar Sep 25 '24 06:09 cockroach-teamcity

The schema change workload failed, though it is unclear why. Sending to Foundations.

msbutler avatar Oct 08 '24 14:10 msbutler

This was expecting a FK violation, but did not get one.

{
 "workerId": 2,
 "clientTimestamp": "05:57:33.584131",
 "ops": [
  "BEGIN",
  {
   "sql": "INSERT INTO public.table_w1_33 (col33_w1_34, col33_w1_35, col33_w1_36, \"Col 33_w1_37\", \"cOl33_w1_38\", col33_w1_39, \"!c\fol3'3_w1_40\", col33_w1_41, col33_w1_42, \"\"\"c😡o%pl33_w1_43\", \"c%vol33\"\"_w1_44\", \"c%pol33\\\\U000D8996_w1_45\", \"col33\"\" _w1_46\", col33_w1_47, col_w1_33_w3_36, col_w1_33_w1_103) VALUES (ARRAY[e't\\\\\\x7f\u003e6\\x11\\x07x\\x1b':::STRING,e'\\n\"':::STRING],false,(-42):::INT8,'\"'::BPCHAR,'D'::BPCHAR,''::STRING,']'::BPCHAR,true,113:::INT8,'\\x06':::BYTES,3614:::OID,90000:::OID,e'\\U0000B56B\\U00022068\\U000E90CB\\U000BE910\\U0000B870\\U0001E065\\U000A7597\\U00057986\\U0007BA9F' COLLATE de_DE,'2009-01-14 13:59:04.000245':::TIMESTAMP,4096:::OID,'13:31:20.343023+09:12':::TIMETZ),(ARRAY[e'\\x15\u003c\\b\\x17\\'S}i':::STRING,e'U\\x17.oX4':::STRING,'':::STRING,e'Q\\x01q\\bt':::STRING,'S!1':::STRING,e'z}\\x7ft\\x0f$\\x1e':::STRING,e's\\x057V9\\x04Q\\n':::STRING,e'\\x1f\u00268me#':::STRING],false,71:::INT8,NULL,'\"'::BPCHAR,''::STRING,'$'::BPCHAR,false,124:::INT8,'\\xe50b502a178bcd5b':::BYTES,1560:::OID,1560:::OID,e'\\u0476\\U000727EF\\U0005B4E3' COLLATE de_DE,'2025-10-18 10:52:35.000972':::TIMESTAMP,25:::OID,'09:26:40.916134+01:15':::TIMETZ)",
   "potentialExecErr": "23502,23503,23505,23514"
  },
  "COMMIT"
 ],
 "expectedExecErrors": "",
 "expectedCommitErrors": "23503",
 "message": "***FAIL; Failed to receive a commit error when at least one commit error was expected",
 "errorState": {
  "expectedCommitErrors": [
   "23503"
  ],
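
For readers unfamiliar with the workload log: SQLSTATE 23503 is `foreign_key_violation`, so the entry above says the COMMIT was expected to fail with an FK violation but succeeded. A minimal sketch of that kind of check, in Python for illustration only; `check_commit`, `parse_codes`, and the returned messages are hypothetical and are not the workload's actual code:

```python
import json

# Standard PostgreSQL SQLSTATE codes that appear in the log entry.
SQLSTATE_NAMES = {
    "23502": "not_null_violation",
    "23503": "foreign_key_violation",
    "23505": "unique_violation",
    "23514": "check_violation",
}


def parse_codes(field):
    """Split a comma-separated SQLSTATE string into a set, skipping blanks."""
    return {code.strip() for code in field.split(",") if code.strip()}


def check_commit(entry, actual_commit_code):
    """Hypothetical helper: return a failure message, or None if the outcome
    matches expectations. actual_commit_code is None when COMMIT succeeded."""
    expected = parse_codes(entry.get("expectedCommitErrors", ""))
    if expected and actual_commit_code is None:
        names = ", ".join(
            f"{c} ({SQLSTATE_NAMES.get(c, 'unknown')})" for c in sorted(expected)
        )
        return f"missing expected commit error: {names}"
    if actual_commit_code is not None and actual_commit_code not in expected:
        return f"unexpected commit error: {actual_commit_code}"
    return None


# Only the two fields needed for the check, copied from the entry above.
entry = json.loads('{"expectedExecErrors": "", "expectedCommitErrors": "23503"}')
print(check_commit(entry, None))
```

The first branch is the case reported here: an expected commit error was recorded, and the commit went through anyway.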

@fqazi, would https://github.com/cockroachdb/cockroach/pull/132168 be related to this?

rafiss avatar Oct 08 '24 18:10 rafiss

roachtest.backup-restore/online-restore failed with artifacts on master @ 42f40f59cae3c0fd8842e194d6991c951ab4382f:

(monitor.go:149).Wait: monitor failure: backup 3_round-trip-test-backup_cluster: mismatched fingerprints for table tpcc.customer
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=4
  • ROACHTEST_encrypted=false
  • ROACHTEST_fs=ext4
  • ROACHTEST_localSSD=true
  • ROACHTEST_runtimeAssertionsBuild=false
  • ROACHTEST_ssd=0

Same failure on other branches

  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.2 release-blocker]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Oct 17 '24 06:10 cockroach-teamcity
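
The "mismatched fingerprints" failure means the per-index fingerprints of the restored table did not match those of the original. Later reports in this thread show the test collecting them with `SHOW EXPERIMENTAL_FINGERPRINTS FROM TABLE ...`, selecting `index_name, fingerprint`. A rough sketch of such a comparison in Python, using made-up fingerprint values and a hypothetical `diff_fingerprints` helper (this is not the roachtest's code):

```python
def diff_fingerprints(original, restored):
    """Return {index_name: (original_fp, restored_fp)} for every index whose
    fingerprint differs or is missing on one side."""
    mismatches = {}
    for index in sorted(set(original) | set(restored)):
        before, after = original.get(index), restored.get(index)
        if before != after:
            mismatches[index] = (before, after)
    return mismatches


# Illustrative values only; real fingerprints come from the query result.
original = {"customer_pkey": "1001", "customer_idx": "2002"}
restored = {"customer_pkey": "1001", "customer_idx": "2999"}
for index, (before, after) in diff_fingerprints(original, restored).items():
    print(f"mismatched fingerprint for index {index}: {before} != {after}")
```

Comparing per index rather than per table narrows a mismatch down to the index whose data diverged.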

roachtest.backup-restore/online-restore failed with artifacts on master @ 833dadd212fa4b12b1442ae8e00e85ee80a8cdce:

(monitor.go:149).Wait: monitor failure: backup 1_round-trip-test-backup_cluster: mismatched fingerprints for table tpcc.customer
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=4
  • ROACHTEST_encrypted=true
  • ROACHTEST_fs=ext4
  • ROACHTEST_localSSD=true
  • ROACHTEST_runtimeAssertionsBuild=false
  • ROACHTEST_ssd=0

Same failure on other branches

  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.2 release-blocker]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Oct 18 '24 06:10 cockroach-teamcity

roachtest.backup-restore/online-restore failed with artifacts on master @ 472ea07a5232c98536293d13bb46cca59f9f2cd0:

(monitor.go:149).Wait: monitor failure: backup 1_round-trip-test-backup_cluster: mismatched fingerprints for table tpcc.customer
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=4
  • ROACHTEST_encrypted=true
  • ROACHTEST_fs=ext4
  • ROACHTEST_localSSD=true
  • ROACHTEST_runtimeAssertionsBuild=false
  • ROACHTEST_ssd=0

Same failure on other branches

  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.2 release-blocker]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Oct 19 '24 06:10 cockroach-teamcity

roachtest.backup-restore/online-restore failed with artifacts on master @ 472ea07a5232c98536293d13bb46cca59f9f2cd0:

  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castTimestampStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:11894
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec.(*defaultBuiltinFuncOperator).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/builtin_funcs.go:41
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*simpleProjectOp).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase/simple_project.go:119
  | github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:113
  | github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:147
  | github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:121
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*fnOp).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase/fn_op.go:27
  | github.com/cockroachdb/cockroach/pkg/sql/colexec.(*orderedAggregator).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/ordered_aggregator.go:204
Wraps: (2) non-nullable column "customer:c_balance" with no value! Index scanned was "customer_pkey" with the index key columns (c_w_id,c_d_id,c_id) and the values (4,8,2915)
Error types: (1) *withstack.withStack (2) *errutil.leafError
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=4
  • ROACHTEST_encrypted=false
  • ROACHTEST_fs=ext4
  • ROACHTEST_localSSD=true
  • ROACHTEST_runtimeAssertionsBuild=false
  • ROACHTEST_ssd=0

Same failure on other branches

  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.2 release-blocker]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Oct 20 '24 06:10 cockroach-teamcity
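
The "non-nullable column ... with no value" message names the table, column, index, and key values of the affected row, which is enough to build a query that re-reads it. A small Python sketch of extracting those fields; `lookup_query` is a hypothetical helper, the `tpcc` database name is an assumption (the restored table may live in a differently named database), and the generated SQL relies on CockroachDB's `table@index` syntax:

```python
import re

# Matches the error text quoted in the report above.
PATTERN = re.compile(
    r'non-nullable column "(?P<table>[^":]+):(?P<column>[^"]+)" with no value! '
    r'Index scanned was "(?P<index>[^"]+)" with the index key columns '
    r'\((?P<cols>[^)]*)\) and the values \((?P<vals>[^)]*)\)'
)


def lookup_query(message, database="tpcc"):
    """Build a SELECT for the row named in the error, or None if no match.
    Values are inserted verbatim, so this only suits numeric keys."""
    m = PATTERN.search(message)
    if m is None:
        return None
    cols = m["cols"].split(",")
    vals = m["vals"].split(",")
    predicate = " AND ".join(f"{c} = {v}" for c, v in zip(cols, vals))
    return f'SELECT * FROM {database}.{m["table"]}@{m["index"]} WHERE {predicate}'


msg = (
    'Wraps: (2) non-nullable column "customer:c_balance" with no value! '
    'Index scanned was "customer_pkey" with the index key columns '
    '(c_w_id,c_d_id,c_id) and the values (4,8,2915)'
)
print(lookup_query(msg))
```

Reading the row back this way may reproduce the same error, which is itself a useful confirmation of which key is affected.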

roachtest.backup-restore/online-restore failed with artifacts on master @ 472ea07a5232c98536293d13bb46cca59f9f2cd0:

  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castTimestampStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:11894
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castDecimalStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:4609
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*castIntStringOp).Next
  | 	github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecbase/cast.eg.go:8670
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils.(*vectorTypeEnforcer).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/operator.go:152
  | github.com/cockroachdb/cockroach/pkg/sql/colexec.(*defaultBuiltinFuncOperator).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/builtin_funcs.go:41
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*simpleProjectOp).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase/simple_project.go:119
  | github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:113
  | github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:147
  | github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:121
  | github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase.(*fnOp).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecbase/fn_op.go:27
  | github.com/cockroachdb/cockroach/pkg/sql/colexec.(*orderedAggregator).Next
  | 	github.com/cockroachdb/cockroach/pkg/sql/colexec/ordered_aggregator.go:204
Wraps: (2) non-nullable column "customer:c_balance" with no value! Index scanned was "customer_pkey" with the index key columns (c_w_id,c_d_id,c_id) and the values (0,1,33)
Error types: (1) *withstack.withStack (2) *errutil.leafError
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=4
  • ROACHTEST_encrypted=true
  • ROACHTEST_fs=ext4
  • ROACHTEST_localSSD=true
  • ROACHTEST_runtimeAssertionsBuild=false
  • ROACHTEST_ssd=0

Same failure on other branches

  • #133018 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3 release-blocker]
  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.2 release-blocker]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Oct 21 '24 06:10 cockroach-teamcity

Note: This build has runtime assertions enabled. If the same failure was hit in a run without assertions enabled, there should be a similar failure without this message. If there isn't one, then this failure is likely due to an assertion violation or (assertion) timeout.

roachtest.backup-restore/online-restore failed with artifacts on master @ eb2d2e19eb29d2747d9e267bd0612a69d066adad:

(monitor.go:149).Wait: monitor failure: backup 1_round-trip-test-backup_cluster: error verifying online restore: backup 1_round-trip-test-backup_cluster: error loading online restored contents: error querying column names for system.settings: dial tcp 34.74.78.82:26257: connect: connection refused
unexpected node event: n3: cockroach process for system interface died (exit code 134)
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • arch=amd64
  • cloud=gce
  • coverageBuild=false
  • cpu=4
  • encrypted=false
  • fs=ext4
  • localSSD=true
  • runtimeAssertionsBuild=true
  • ssd=0

Same failure on other branches

  • #135790 roachtest: backup-restore/online-restore failed [A-disaster-recovery B-runtime-assertions-enabled C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3.0-rc release-blocker]
  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Nov 21 '24 07:11 cockroach-teamcity

The most recent failure is a block checksum error. Since we haven't seen the old failure in a month, I'm retitling this issue to be about the checksum error and bringing it back to DR and/or Storage. I'm calling the Foundations/schema change flake closed unless we see it again.

dt avatar Nov 21 '24 23:11 dt
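
A note on the `exit code 134` in the node event above: by common shell convention, an exit status above 128 means the process was terminated by signal `status - 128`, so 134 corresponds to signal 6, which is SIGABRT on Linux. That is consistent with the process aborting rather than exiting normally, though it does not by itself identify the cause. A small Python sketch of the decoding (`describe_exit` is a hypothetical helper):

```python
import signal


def describe_exit(status):
    """Map a shell-style exit status to a description. Statuses above 128
    are interpreted as 128 + signal number."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {status}"


print(describe_exit(134))
```

Signal numbering is platform-specific, so the name printed depends on where the sketch runs.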

Note: This build has runtime assertions enabled. If the same failure was hit in a run without assertions enabled, there should be a similar failure without this message. If there isn't one, then this failure is likely due to an assertion violation or (assertion) timeout.

roachtest.backup-restore/online-restore failed with artifacts on master @ 5c5c9d6803d47848aa1960dd6642d5f2c1926814:

(monitor.go:149).Wait: monitor failure: backup 2_round-trip-test-backup_database-tpcc: error verifying online restore: backup 2_round-trip-test-backup_database-tpcc: error loading online restored contents: error when running query [SELECT index_name, fingerprint FROM [SHOW EXPERIMENTAL_FINGERPRINTS FROM TABLE restore_2_round_trip_test_backup_database_tpcc_2.new_order] ORDER BY index_name]: pq: internal error while retrieving user account memberships: operation "get-user-session" timed out after 10.001s (given timeout 10s): internal error while retrieving user account: get default settings error: interrupted during singleflight load-value:defaultsettings-roachprod-100-1: context deadline exceeded
unexpected node event: n3: cockroach process for system interface died (exit code 134)
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • arch=amd64
  • cloud=gce
  • coverageBuild=false
  • cpu=4
  • encrypted=true
  • fs=ext4
  • localSSD=true
  • runtimeAssertionsBuild=true
  • ssd=0

Same failure on other branches

  • #135790 roachtest: backup-restore/online-restore failed [A-disaster-recovery B-runtime-assertions-enabled C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3.0-rc release-blocker]
  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Nov 22 '24 07:11 cockroach-teamcity

This issue has multiple T-eam labels. Please make sure it only has one, or else issue synchronization will not work correctly.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

blathers-crl[bot] avatar Nov 22 '24 15:11 blathers-crl[bot]

Note: This build has runtime assertions enabled. If the same failure was hit in a run without assertions enabled, there should be a similar failure without this message. If there isn't one, then this failure is likely due to an assertion violation or (assertion) timeout.

roachtest.backup-restore/online-restore failed with artifacts on master @ cea3ff5562160a3bf2802da052da2aaa40e1ccc1:

(monitor.go:149).Wait: monitor failure: backup 2_round-trip-test-backup_database-tpcc: error verifying online restore: backup 2_round-trip-test-backup_database-tpcc: error loading online restored contents: error when running query [SELECT index_name, fingerprint FROM [SHOW EXPERIMENTAL_FINGERPRINTS FROM TABLE restore_2_round_trip_test_backup_database_tpcc_2.item] ORDER BY index_name]: read tcp 172.17.0.3:45490 -> 35.237.252.62:26257: i/o timeout
unexpected node event: n3: cockroach process for system interface died (exit code 134)
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • arch=amd64
  • cloud=gce
  • coverageBuild=false
  • cpu=4
  • encrypted=true
  • fs=ext4
  • localSSD=true
  • runtimeAssertionsBuild=true
  • ssd=0

Same failure on other branches

  • #135790 roachtest: backup-restore/online-restore failed [A-disaster-recovery B-runtime-assertions-enabled C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3.0-rc release-blocker]
  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Nov 23 '24 07:11 cockroach-teamcity

roachtest.backup-restore/online-restore failed with artifacts on master @ f717f6bd218121bb5e3376af658545f6bff30c22:

(monitor.go:149).Wait: monitor failure: backup 1_round-trip-test-backup_database-tpcc: error verifying online restore: backup 1_round-trip-test-backup_database-tpcc: error loading online restored contents: error when running query [SELECT index_name, fingerprint FROM [SHOW EXPERIMENTAL_FINGERPRINTS FROM TABLE restore_1_round_trip_test_backup_database_tpcc_1.warehouse] ORDER BY index_name]: pq: internal error while retrieving user account memberships: operation "get-user-session" timed out after 10.001s (given timeout 10s): internal error while retrieving user account: get auth info error: interrupted during singleflight load-value:authinfo-roachprod-8-8: context deadline exceeded
unexpected node event: n2: cockroach process for system interface died (exit code 134)
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • arch=amd64
  • cloud=gce
  • coverageBuild=false
  • cpu=4
  • encrypted=false
  • fs=ext4
  • localSSD=true
  • runtimeAssertionsBuild=false
  • ssd=0

Same failure on other branches

  • #135790 roachtest: backup-restore/online-restore failed [A-disaster-recovery B-runtime-assertions-enabled C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3.0-rc release-blocker]
  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Nov 24 '24 07:11 cockroach-teamcity

roachtest.backup-restore/online-restore failed with artifacts on master @ f717f6bd218121bb5e3376af658545f6bff30c22:

(monitor.go:149).Wait: monitor failure: backup 2_round-trip-test-backup_cluster: error verifying online restore: backup 2_round-trip-test-backup_cluster: error loading online restored contents: error running command (./cockroach sql --certs-dir certs -e "SELECT * FROM [SHOW USERS]" --port {pgport:2}): COMMAND_PROBLEM: exit status 1
unexpected node event: n2: cockroach process for system interface died (exit code 134)
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • arch=amd64
  • cloud=gce
  • coverageBuild=false
  • cpu=4
  • encrypted=true
  • fs=ext4
  • localSSD=true
  • runtimeAssertionsBuild=false
  • ssd=0

Same failure on other branches

  • #135790 roachtest: backup-restore/online-restore failed [A-disaster-recovery B-runtime-assertions-enabled C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3.0-rc release-blocker]
  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Nov 25 '24 06:11 cockroach-teamcity

Can confirm this bug started occurring when columnar blocks were enabled for the backup sst sink in https://github.com/cockroachdb/cockroach/commit/fa505ac51abf87bdc607cf4d6387ce5359b93fa9.

This will be fixed once https://github.com/cockroachdb/pebble/pull/4181 is merged into Pebble and the Pebble ref is bumped in Cockroach.

itsbilal avatar Nov 25 '24 20:11 itsbilal

Reopening until the pebble bump is in.

itsbilal avatar Nov 25 '24 21:11 itsbilal

@itsbilal thanks for finding and fixing this! Do you plan to backport this to 24.3 (and 24.2)?

msbutler avatar Nov 25 '24 21:11 msbutler

@msbutler we could backport it to 24.3; however, it would be near-impossible to hit there, since columnar blocks are hard to turn on in that release. This bug does not exist on 24.2 (no columnar blocks).

itsbilal avatar Nov 25 '24 21:11 itsbilal

Let's backport it to 24.3.

itsbilal avatar Nov 25 '24 21:11 itsbilal

Note: This build has runtime assertions enabled. If the same failure was hit in a run without assertions enabled, there should be a similar failure without this message. If there isn't one, then this failure is likely due to an assertion violation or (assertion) timeout.

roachtest.backup-restore/online-restore failed with artifacts on master @ 67caf19d3998bb3ca1ada7e3c14486d505b68012:

(monitor.go:149).Wait: monitor failure: full command output in run_072443.565297314_n4_COCKROACHRANDOMSEED4.log: COMMAND_PROBLEM: exit status 1
test artifacts and logs in: /artifacts/backup-restore/online-restore/run_1

Parameters:

  • arch=amd64
  • cloud=gce
  • coverageBuild=false
  • cpu=4
  • encrypted=true
  • fs=ext4
  • localSSD=true
  • runtimeAssertionsBuild=true
  • ssd=0

Same failure on other branches

  • #132634 roachtest: backup-restore/online-restore failed [A-disaster-recovery C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.2]
  • #130859 roachtest: backup-restore/online-restore failed [C-test-failure O-roachtest O-robot P-2 T-disaster-recovery branch-release-24.1.5-rc]

cockroach-teamcity avatar Nov 26 '24 07:11 cockroach-teamcity