Multiple databases - logging duplicated across both controllers
If you manage multiple databases with SchemaHero, a separate controller pod is created for each one. Some log entries that I expect to appear only in the managing controller's logs show up in both.
The cleanest way to reproduce this is as follows:
- create an empty mysql db somewhere called db1
- deploy a database in k8s managing db1, verify function by making a sample table
- create an empty mysql db somewhere called db2
- deploy a database in k8s managing db2, verify function by making a sample table
- live tail the logs for both db1-controller-0 and db2-controller-0 in separate terminals
- deploy a table to db2, notice duplicate log entries in both the db1-controller-0 and db2-controller-0 logs
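The reproduction above can be sketched with kubectl; the manifest file names here are illustrative assumptions (the pod names match the ones observed below):

```shell
# Deploy the two Database resources (manifests shown further down).
kubectl apply -f db1-database.yaml
kubectl apply -f db2-database.yaml

# In two separate terminals, live-tail both controllers.
kubectl logs -f db1-controller-0
kubectl logs -f db2-controller-0

# Deploy a table that targets only db2; duplicate "Executing query"
# entries then show up in both controllers' logs.
kubectl apply -f demotable.yaml
```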
In my testing I used the above setup; here is my table manifest from step 6:
apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: demotable
spec:
  database: db2
  name: demotable
  schema:
    mysql:
      primaryKey:
        - name
      columns:
        - name: name
          type: varchar(20)
          constraints:
            notNull: true
        - name: user_id
          type: varchar(20)
          constraints:
            notNull: true
The log entry I see in db2-controller-0 is expected:
Executing query "create table `demotable` (`name` varchar (20) not null, `user_id` varchar (20) not null, primary key (`name`))"
The log entries I see in db1-controller-0 are unexpected:
Executing query "create table `demotable` (`name` varchar (20) not null, `user_id` varchar (20) not null, primary key (`name`))"
{"level":"error","msg":"Error 1050: Table 'demotable' already exists"}
I'm unclear on why the Table 'demotable' already exists entry shows up. I'm concerned that db1-controller-0 might also be attempting to manage the schema for db2, but I haven't run that theory down yet.
For completeness here are my DB manifests:
db1 (step 2 above):
apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: db1
  namespace: default
spec:
  schemahero:
    immediateDeploy: true
  connection:
    mysql:
      uri:
        value: root:passworddb1@tcp(mysql-db1:3306)/db1?tls=false
db2 (step 4 above):
apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: db2
  namespace: default
spec:
  schemahero:
    immediateDeploy: true
  connection:
    mysql:
      uri:
        value: root:passworddb2@tcp(mysql-db2:3306)/db2?tls=false
Please let me know if I can supply further information or test anything specific.
I was able to confirm that all database controller pods are indeed trying to manage a single database, rather than each controller pod managing only its own database in isolation. Findings below.
I currently have 3 databases defined and 3 controller pods running; kubectl get pods -o wide shows the pod IP mappings:
db1-controller-0 1/1 Running 0 77m 10.32.0.44 kyler-support-schemahero <none> <none>
db2-controller-0 1/1 Running 0 74m 10.32.0.46 kyler-support-schemahero <none> <none>
db3-controller-0 1/1 Running 0 53m 10.32.0.48 kyler-support-schemahero <none> <none>
Attempt to create a table on db2 using the following manifest:
apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: demotable4
spec:
  database: db2
  name: demotable4
  schema:
    mysql:
      primaryKey:
        - name
      columns:
        - name: name
          type: varchar(20)
          constraints:
            notNull: true
        - name: user_id
          type: varchar(20)
          constraints:
            notNull: true
MySQL logging on db2 shows all 3 controller pods logging into db2 and attempting to create the table:
2021-08-28T06:35:43.509917Z 523 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.510289Z 523 Quit
2021-08-28T06:35:43.511640Z 524 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.512786Z 524 Prepare select count(1) from information_schema.TABLES where TABLE_NAME = ? and TABLE_SCHEMA = ?
2021-08-28T06:35:43.512868Z 524 Execute select count(1) from information_schema.TABLES where TABLE_NAME = 'demotable4' and TABLE_SCHEMA = 'db2'
2021-08-28T06:35:43.513542Z 524 Close stmt
2021-08-28T06:35:43.513595Z 524 Quit
2021-08-28T06:35:43.525160Z 525 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.526027Z 525 Quit
2021-08-28T06:35:43.526456Z 526 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.526567Z 527 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.526860Z 526 Quit
2021-08-28T06:35:43.527304Z 527 Quit
2021-08-28T06:35:43.527570Z 528 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.527974Z 528 Query create table `demotable4` (`name` varchar (20) not null, `user_id` varchar (20) not null, primary key (`name`))
2021-08-28T06:35:43.528590Z 529 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.528866Z 529 Query create table `demotable4` (`name` varchar (20) not null, `user_id` varchar (20) not null, primary key (`name`))
2021-08-28T06:35:43.528938Z 530 Connect root@<controller-pod-ip> on db2 using TCP/IP
2021-08-28T06:35:43.529243Z 530 Query create table `demotable4` (`name` varchar (20) not null, `user_id` varchar (20) not null, primary key (`name`))
2021-08-28T06:35:43.604828Z 528 Quit
2021-08-28T06:35:43.606127Z 529 Quit
2021-08-28T06:35:43.606183Z 530 Quit
The databases are in the same k8s cluster, and the SchemaHero controller pods connect over ClusterIP. Notice connections from all 3 controller pods, as well as three create table statements.
This seems to only affect setups where different databases use the same table names and all resources are in the same namespace. A workaround is to place the resources for each database into their own namespace.
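A minimal sketch of that workaround, assuming a dedicated namespace per database (the db1-schemas namespace name and the mysql-db1.default.svc service address are illustrative):

```yaml
# Dedicated namespace holding everything SchemaHero-related for db1.
apiVersion: v1
kind: Namespace
metadata:
  name: db1-schemas
---
apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: db1
  namespace: db1-schemas
spec:
  schemahero:
    immediateDeploy: true
  connection:
    mysql:
      uri:
        # Fully-qualified service name, since MySQL now lives in a
        # different namespace than the controller.
        value: root:passworddb1@tcp(mysql-db1.default.svc:3306)/db1?tls=false
---
apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: demotable
  namespace: db1-schemas
spec:
  database: db1
  name: demotable
  schema:
    mysql:
      primaryKey:
        - name
      columns:
        - name: name
          type: varchar(20)
          constraints:
            notNull: true
```

With db2's Database and Table resources moved into their own namespace the same way, the duplicate entries described above no longer appear.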
This issue is still being reported: https://kubernetes.slack.com/archives/CS8RSG6P3/p1632785841056400?thread_ts=1630932255.010200&cid=CS8RSG6P3
I'm trying to manually reproduce this so that I can attempt to fix and write a test around this, but so far haven't been successful. Here's my setup:
- Create a Kubernetes cluster on DigitalOcean (v1.21.3)
- Create two separate, managed, Postgres 13 instances on DigitalOcean (names: schemahero1, schemahero2)
- Install the krew schemahero plugin (v0.12.4)
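For reference, the plugin and the in-cluster operator can be set up as follows; a sketch, assuming krew itself is already installed:

```shell
# Install the SchemaHero kubectl plugin via krew, then use it to
# deploy the in-cluster operator.
kubectl krew install schemahero
kubectl schemahero install
```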
Next, I created a single SchemaHero Database type for each database:
apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: schemahero1
spec:
  connection:
    postgres:
      uri:
        value: postgres://schemahero1:<redacted>@schemahero1-do-user-898281-0.b.db.ondigitalocean.com:25060/schemahero1?sslmode=require
---
apiVersion: databases.schemahero.io/v1alpha4
kind: Database
metadata:
  name: schemahero2
spec:
  connection:
    postgres:
      uri:
        value: postgres://schemahero2:<redacted>@schemahero2-do-user-898281-0.b.db.ondigitalocean.com:5432/schemahero1?sslmode=require
(Note that the schemahero2 port is intentionally incorrect in this connection string. SchemaHero doesn't log much when things are going well, so I want to see a connection-failed message, as the Slack report mentioned.)
Confirmed that these two controllers have started:
$ kubectl logs -f schemahero1-controller-0
{"level":"info","msg":"Starting schemahero version [{Version:v0.12.4 GitSHA:0773cf5 BuildTime:2021-07-22 17:51:04 +0000 UTC TimeFallback:}]"}
$ kubectl logs -f schemahero2-controller-0
{"level":"info","msg":"Starting schemahero version [{Version:v0.12.4 GitSHA:0773cf5 BuildTime:2021-07-22 17:51:04 +0000 UTC TimeFallback:}]"}
Finally, I create a table targeting the schemahero2 database:
apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: airport2
spec:
  database: schemahero2
  name: airport2
  schema:
    postgres:
      primaryKey: [code]
      columns:
        - name: code
          type: char(4)
        - name: name
          type: varchar(255)
And then apply it.
The output of schemahero1's controller is silent (nothing). The output of schemahero2's controller correctly complains that the connection cannot be established:
kubectl logs -f schemahero2-controller-0
{"level":"info","msg":"Starting schemahero version [{Version:v0.12.4 GitSHA:0773cf5 BuildTime:2021-07-22 17:51:04 +0000 UTC TimeFallback:}]"}
{"level":"error","msg":"failed to plan sync: failed to connect to postgres: failed to connect to postgres: failed to connect to `host=schemahero2-do-user-898281-0.b.db.ondigitalocean.com user=schemahero2 database=schemahero1`: dial error (dial tcp 138.68.238.126:5432: connect: connection timed out)"}
This is still an open issue; I need to find a different approach to reproducing it.