Multiple databases - logging duplicated across both controllers
If you manage multiple databases with SchemaHero, a separate controller pod is created for each one. Some log entries that I expect to appear only in the managing controller's logs show up in both.
The cleanest way to reproduce this is as follows:
- create an empty mysql db somewhere called db1
- deploy a database in k8s managing db1, verify function by making a sample table
- create an empty mysql db somewhere called db2
- deploy a database in k8s managing db2, verify function by making a sample table
- live tail the logs for both db1-controller-0 and db2-controller-0 in separate terminals
- deploy a table to db2, notice duplicate log entries in both the db1-controller-0 and db2-controller-0 logs
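The reproduction above can be sketched with kubectl; the manifest file names here are illustrative assumptions (the pod names match the ones observed below):

```shell
# Deploy the two Database resources (manifests shown further down).
kubectl apply -f db1-database.yaml
kubectl apply -f db2-database.yaml

# In two separate terminals, live-tail both controllers.
kubectl logs -f db1-controller-0
kubectl logs -f db2-controller-0

# Deploy a table that targets only db2; duplicate "Executing query"
# entries then show up in both controllers' logs.
kubectl apply -f demotable.yaml
```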
In my testing I used the above setup; here is my table manifest from step 6:
apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: demotable
spec:
  database: db2
  name: demotable
  schema:
    mysql:
      primaryKey:
        - name
      columns:
        - name: name
          type: varchar(20)
          constraints:
            notNull: true
        - name: user_id
          type: varchar(20)
          constraints:
            notNull: true
The log entry I see in db2-controller-0 is expected:
Executing query "create table `demotable` (`name` varchar (20) not null, `user_id` varchar (20) not null, primary key (`name`))"
The log entries I see in db1-controller-0 are unexpected:
Executing query "create table `demotable` (`name` varchar (20) not null, `user_id` varchar (20) not null, primary key (`name`))"
{"level":"error","msg":"Error 1050: Table 'demotable' already exists"}
I'm unclear on why the Table 'demotable' already exists entry shows up. I'm concerned that db1-controller-0 might also be attempting to manage the schema for db2, but I haven't run that theory down yet.
For completeness here are my DB manifests:
db1 (step 2 above):
apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: db1
  namespace: default
spec:
  schemahero:
    immediateDeploy: true
  connection:
    mysql:
      uri:
        value: root:passworddb1@tcp(mysql-db1:3306)/db1?tls=false
db2 (step 4 above):
apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: db2
  namespace: default
spec:
  schemahero:
    immediateDeploy: true
  connection:
    mysql:
      uri:
        value: root:passworddb2@tcp(mysql-db2:3306)/db2?tls=false
Please let me know if I can supply further information or test anything specific.
I was able to confirm that all database controller pods are indeed trying to manage a single database, rather than each controller pod managing only its own database in isolation. Findings below.
I currently have 3 databases defined and 3 controller pods running; kubectl get pods -o wide shows the pod IP mappings:
db1-controller-0 1/1 Running 0 77m 10.32.0.44 kyler-support-schemahero <none> <none>
db2-controller-0 1/1 Running 0 74m 10.32.0.46 kyler-support-schemahero <none> <none>
db3-controller-0 1/1 Running 0 53m 10.32.0.48 kyler-support-schemahero <none> <none>
Attempt to create a table on db2 using the following manifest:
apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: demotable4
spec:
  database: db2
  name: demotable4
  schema:
    mysql:
      primaryKey:
        - name
      columns:
        - name: name
          type: varchar(20)
          constraints:
            notNull: true
        - name: user_id
          type: varchar(20)
          constraints:
            notNull: true
MySQL logging on db2 shows all 3 controller pods logging into db2 and attempting to create the table:
2021-08-28T06:35:43.509917Z 523 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.510289Z 523 Quit
2021-08-28T06:35:43.511640Z 524 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.512786Z 524 Prepare select count(1) from information_schema.TABLES where TABLE_NAME = ? and TABLE_SCHEMA = ?
2021-08-28T06:35:43.512868Z 524 Execute select count(1) from information_schema.TABLES where TABLE_NAME = 'demotable4' and TABLE_SCHEMA = 'db2'
2021-08-28T06:35:43.513542Z 524 Close stmt
2021-08-28T06:35:43.513595Z 524 Quit
2021-08-28T06:35:43.525160Z 525 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.526027Z 525 Quit
2021-08-28T06:35:43.526456Z 526 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.526567Z 527 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.526860Z 526 Quit
2021-08-28T06:35:43.527304Z 527 Quit
2021-08-28T06:35:43.527570Z 528 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.527974Z 528 Query create table `demotable4` (`name` varchar (20) not null, `user_id` varchar (20) not null, primary key (`name`))
2021-08-28T06:35:43.528590Z 529 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.528866Z 529 Query create table `demotable4` (`name` varchar (20) not null, `user_id` varchar (20) not null, primary key (`name`))
2021-08-28T06:35:43.528938Z 530 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.529243Z 530 Query create table `demotable4` (`name` varchar (20) not null, `user_id` varchar (20) not null, primary key (`name`))
2021-08-28T06:35:43.604828Z 528 Quit
2021-08-28T06:35:43.606127Z 529 Quit
2021-08-28T06:35:43.606183Z 530 Quit
The databases are in the same k8s cluster, and the SchemaHero controller pods connect over ClusterIP. Notice connections from all 3 controller pods, as well as three create table statements.
This seems to only affect setups where different databases use the same table names and all resources are in the same namespace. A workaround is to place the resources for each database into their own namespace.
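A minimal sketch of that workaround, assuming a dedicated namespace per database (the db1-schemas namespace name and the mysql-db1.default.svc service address are illustrative):

```yaml
# Dedicated namespace holding everything SchemaHero-related for db1.
apiVersion: v1
kind: Namespace
metadata:
  name: db1-schemas
---
apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: db1
  namespace: db1-schemas
spec:
  schemahero:
    immediateDeploy: true
  connection:
    mysql:
      uri:
        # Fully-qualified service name, since MySQL now lives in a
        # different namespace than the controller.
        value: root:passworddb1@tcp(mysql-db1.default.svc:3306)/db1?tls=false
---
apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: demotable
  namespace: db1-schemas
spec:
  database: db1
  name: demotable
  schema:
    mysql:
      primaryKey:
        - name
      columns:
        - name: name
          type: varchar(20)
          constraints:
            notNull: true
```

With db2's Database and Table resources moved into their own namespace the same way, the duplicate entries described above no longer appear.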
This issue is still being reported: https://kubernetes.slack.com/archives/CS8RSG6P3/p1632785841056400?thread_ts=1630932255.010200&cid=CS8RSG6P3
I'm trying to manually reproduce this so that I can attempt to fix and write a test around this, but so far haven't been successful. Here's my setup:
- Create a Kubernetes cluster on DigitalOcean (v1.21.3)
- Create two separate, managed, Postgres 13 instances on DigitalOcean (names: schemahero1, schemahero2)
- Install the krew schemahero plugin (v0.12.4)
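For reference, the plugin and the in-cluster operator can be set up as follows; a sketch, assuming krew itself is already installed:

```shell
# Install the SchemaHero kubectl plugin via krew, then use it to
# deploy the in-cluster operator.
kubectl krew install schemahero
kubectl schemahero install
```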
Next, I created a single SchemaHero Database type for each database:
apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: schemahero1
spec:
  connection:
    postgres:
      uri:
        value: postgres://schemahero1:<redacted>@schemahero1-do-user-898281-0.b.db.ondigitalocean.com:25060/schemahero1?sslmode=require
---
apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: schemahero2
spec:
  connection:
    postgres:
      uri:
        value: postgres://schemahero2:<redacted>@schemahero2-do-user-898281-0.b.db.ondigitalocean.com:5432/schemahero1?sslmode=require
(Note that the schemahero2 port is intentionally incorrect in this connection string. SchemaHero doesn't log much when things are going well, so I want to see a connection-failed message, as the Slack report mentioned.)
Confirmed that these two controllers have started:
$ kubectl logs -f schemahero1-controller-0
{"level":"info","msg":"Starting schemahero version [{Version:v0.12.4 GitSHA:0773cf5 BuildTime:2021-07-22 17:51:04 +0000 UTC TimeFallback:}]"}
$ kubectl logs -f schemahero2-controller-0
{"level":"info","msg":"Starting schemahero version [{Version:v0.12.4 GitSHA:0773cf5 BuildTime:2021-07-22 17:51:04 +0000 UTC TimeFallback:}]"}
Finally, I create a table targeting the schemahero2 database:
apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: airport2
spec:
  database: schemahero2
  name: airport2
  schema:
    postgres:
      primaryKey: [code]
      columns:
        - name: code
          type: char(4)
        - name: name
          type: varchar(255)
And then apply it.
The output of schemahero1's controller is silent (nothing). The output of schemahero2's controller correctly complains that the connection cannot be established:
kubectl logs -f schemahero2-controller-0
{"level":"info","msg":"Starting schemahero version [{Version:v0.12.4 GitSHA:0773cf5 BuildTime:2021-07-22 17:51:04 +0000 UTC TimeFallback:}]"}
{"level":"error","msg":"failed to plan sync: failed to connect to postgres: failed to connect to postgres: failed to connect to `host=schemahero2-do-user-898281-0.b.db.ondigitalocean.com user=schemahero2 database=schemahero1`: dial error (dial tcp 138.68.238.126:5432: connect: connection timed out)"}
This is still an open issue; I need to find a different approach to reproducing it.