BigQuery: adding a column with a list of strings/... leads to an incorrect ALTER TABLE statement
The background is this code:
- https://github.com/PeerDB-io/peerdb/blob/main/flow/connectors/bigquery/bigquery.go#L236-L239
- https://github.com/PeerDB-io/peerdb/blob/main/flow/connectors/bigquery/qvalue_convert.go#L41-L43
Basically, the ALTER TABLE statement omits the array type when the new column is a list, so the column is created with the base type instead. For example, if you add a column of type list of strings, PeerDB creates a column of type string in BigQuery. This leads to Avro errors down the line, which look like this:
[...]
failed to sync records:
failed to push to avro stage:
failed to write records to local Avro file:
failed to write records to temporary Avro file:
failed to write records to OCF writer:
failed to write record to OCF:
cannot translate datum to binary:
[...]
cannot encode binary record "<affected table name>" field "<affected column name>":
value does not match its schema:
cannot encode binary union:
no member schema types support datum:
allowed types: [null string]; received: map[string]interface {}
Note: don't be deceived by allowed types: [null string]; received: map[string]interface {}. This error message is very misleading. What matters here is the value, not the type: the value is map[array:[]], which is correct, since we want to write an empty list. But Avro only accepts nullable strings here because the BigQuery schema is incorrect.
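To make the expected behavior concrete, here is a minimal sketch of what the generated statement should presumably look like for list columns. It is not PeerDB's actual code: the `column` type and the functions `bigQueryColumnType` and `addColumnStatement` are hypothetical names made up for illustration, and the dataset, table, and column names are placeholders. The only point is that the element type gets wrapped in ARRAY<...> when the new column is a list.

```go
package main

import "fmt"

// column is a hypothetical, simplified description of a newly added column.
// It is not a PeerDB type.
type column struct {
	Name     string
	BaseType string // BigQuery element type, e.g. "STRING"
	IsArray  bool   // true if the source column is a list of BaseType
}

// bigQueryColumnType returns the type to use in the DDL statement,
// wrapping the element type in ARRAY<...> for list columns.
func bigQueryColumnType(c column) string {
	if c.IsArray {
		return fmt.Sprintf("ARRAY<%s>", c.BaseType)
	}
	return c.BaseType
}

// addColumnStatement builds the ALTER TABLE statement for a new column.
// Identifier quoting is omitted to keep the sketch short.
func addColumnStatement(dataset, table string, c column) string {
	return fmt.Sprintf("ALTER TABLE %s.%s ADD COLUMN IF NOT EXISTS %s %s",
		dataset, table, c.Name, bigQueryColumnType(c))
}

func main() {
	tags := column{Name: "tags", BaseType: "STRING", IsArray: true}
	fmt.Println(addColumnStatement("my_dataset", "my_table", tags))
}
```

With the array wrapper dropped (the behavior described in this issue), the same call would yield a plain STRING column, which is why the later Avro write only allows null or string.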
@Amogh-Bharadwaj probably something that'd be fixed as part of #1679