kitchen_sync icon indicating copy to clipboard operation
kitchen_sync copied to clipboard

MySQL UTF8 encoding issues with table-specific encodings/collations

Open javier-sanz opened this issue 5 years ago • 5 comments

So syncing to tables between 2 MySQL databases I found the next error

Error in the 'to' worker: Incorrect string value: '\xF0\x9F\x91\x8D' for column 'msg' at row 2647

The column is created like this by KS

CREATE TABLE `orders` (
...
  `msg` mediumtext,
...
)

But comparing DLL I have this on the original table

CREATE TABLE `orders` (
...
`msg` mediumtext COLLATE utf8mb4_unicode_ci,
...
) ENGINE=InnoDB  DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

but on the copy

CREATE TABLE `orders` (
...
`msg` mediumtext mediumtext COLLATE utf8_unicode_ci,
...
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

I guess that triggers this error.

javier-sanz avatar Sep 30 '20 23:09 javier-sanz

Unfortunately KS doesn't know about charsets and collation orders yet. It's actually pretty easy to add but it would be nonportable between databases which is awkward.

Because most people use the same charsets and collation orders for every table in the database, in practice it's usually enough to set it at the database level.

So I suggest setting the default charset when you create the database (you'll probably find the default collation for that charset is right). You can do this using mysqladmin create with the --default-character-set option, or if it's already created you can use ALTER DATABASE.

willbryant avatar Oct 01 '20 00:10 willbryant

Unfortunately our default character-set an collation for the db is utf8 and utf8_unicode_cirespectively. This table is the one is different. On the other hand if I create manually our DB with our flyway scripts the index names are not deterministic and KS detects de differences So it tries to drop index and to create a new one with a new name. Unfortunately that cannot be done because some are associated with foreign keys and it gives me an error. So right now I either rename manually the index or nullify those fields depending if I create manually the DB or I let KS to do it 😄 .

Again thanks for you fast reply.

javier-sanz avatar Oct 01 '20 08:10 javier-sanz

Hmm yeah that's a tricky one. I'll have a think about how I could add portable character set encodings. Probably doable with some options to ignore them when necessary.

Is that table with the different character set also the one with the random (!) index names? Otherwise you could at least mysqldump -d to get one table definition and load that in before running KS, so you can get on with seeing if KS works for you otherwise.

willbryant avatar Oct 02 '20 22:10 willbryant

I've fixed the index thing first anyway (see #94 - released in v2.7).

willbryant avatar Oct 04 '20 10:10 willbryant

Thanks for this. I can confirm that it did work. I compiled the last version and using a destiny database created with flyway there was not indexes renaming problem. Regarding the indexes having different names those ones are the ones created by MySQL associated to the foreign key, so they were never properly named on the creation scripts. The original DB was migrated from version 5.6 to 5.7 not long time ago. This could explain it but I have not had the time to confirm it.

javier-sanz avatar Oct 08 '20 15:10 javier-sanz