Duplicate data with SSH tunnel and Kafka with upgrades
What version of Materialize are you using?
77275f5565
What is the issue?
Seen in https://buildkite.com/materialize/nightlies/builds/6791#018e14af-0376-4c54-9326-28963743bf81, but seems unrelated to the PR
> SELECT * FROM alter_connection_source_2b;
rows didn't match; sleeping to see if dataflow catches up 50ms 75ms 113ms 169ms 253ms 380ms 570ms 854ms 1s 2s 3s 4s 6s 10s 15s 22s 33s 49s 74s 78s
^^^ +++
14:1: error: non-matching rows: expected:
[["fourty"], ["ten"], ["thirty"], ["twenty"]]
got:
[["fourty"], ["ten"], ["thirty"], ["thirty"], ["twenty"]]
Poor diff:
+ thirty
|
13 |
14 | > SELECT * FROM alter_connection_source_2b;
| ^
The "thirty" row is only ingested once in the AlterConnectionToNonSsh check, so this is scary. I'll try reproducing it with:
while true; do bin/mzcompose --find platform-checks down && bin/mzcompose --find platform-checks run default --scenario=UpgradeEntireMzFourVersions --check=AlterConnectionToNonSsh || break; done
Eek! This is really scary. Any help reproing this would be great. Adding to "under discussion" on the storage team board.
I couldn't repro this locally, but it happened again in CI: https://buildkite.com/materialize/nightlies/builds/6803#018e18e5-e2ce-4bc0-b42c-b483e8180d59
Very similar issue in the same job in https://buildkite.com/materialize/nightlies/builds/6860#018e326e-6f0f-4561-a6b8-4feb05d8efe5:
8:1: error: non-matching rows: expected:
[["four"], ["one"], ["three"], ["two"]]
got:
[["four"], ["one"], ["three"], ["two"], ["two"]]
Poor diff:
+ two
|
7 |
8 | > SELECT * FROM alter_connection_source_3a;
| ^
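For anyone triaging similar failures, here is a minimal sketch of how the "Poor diff" in the reports above can be recomputed as a multiset difference, so that duplicated rows show up explicitly. The helper name multiset_diff is hypothetical and is not part of testdrive or mzcompose; the row data is copied from the two failures quoted in this issue.

```python
from collections import Counter


def multiset_diff(expected, got):
    """Compare two row lists as multisets (duplicates count).

    Returns (extra, missing): rows that appear more often in `got`
    than in `expected`, and rows that appear less often.
    """
    expected_counts = Counter(tuple(row) for row in expected)
    got_counts = Counter(tuple(row) for row in got)
    # Counter subtraction keeps only positive counts, so each side
    # reports exactly the surplus occurrences.
    extra = sorted((got_counts - expected_counts).elements())
    missing = sorted((expected_counts - got_counts).elements())
    return extra, missing


# Rows from the alter_connection_source_2b failure.
expected_2b = [["fourty"], ["ten"], ["thirty"], ["twenty"]]
got_2b = [["fourty"], ["ten"], ["thirty"], ["thirty"], ["twenty"]]

# Rows from the alter_connection_source_3a failure.
expected_3a = [["four"], ["one"], ["three"], ["two"]]
got_3a = [["four"], ["one"], ["three"], ["two"], ["two"]]

extra_2b, missing_2b = multiset_diff(expected_2b, got_2b)
extra_3a, missing_3a = multiset_diff(expected_3a, got_3a)

print("2b extra:", extra_2b, "missing:", missing_2b)
print("3a extra:", extra_3a, "missing:", missing_3a)
```

In both failures the difference is a single extra copy of one row with nothing missing, which is consistent with one record being ingested twice rather than with data loss or reordering.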