Allow using gh-ost on Galera / XtraDB Cluster
Related issue: https://github.com/github/gh-ost/issues/224
Description
Currently gh-ost can't work on Galera-based MySQL such as PXC because it needs to perform LOCK TABLES for both the atomic and the two-step cut-over.
A known workaround is to use the two-step cut-over strategy, but only with two not-so-common requirements:
- All the write queries must be sent to one single node in the cluster
- The table may be missing for some time during the RENAME steps.
To use gh-ost on any Galera-based system without the above requirements, I created a new cut-over strategy named trigger.
With this option, during the cut-over phase gh-ost behaves exactly like the well-known tool pt-online-schema-change: it adds triggers on the original table to keep the ghost one synchronised.
How it works
To get an atomic table swap without using LOCK TABLES, three triggers are created during the cut-over phase, as pt-online-schema-change does, and a special event is injected into the binlog to handle the delegation of writes from gh-ost to the MySQL triggers.
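As a rough illustration of the three triggers mentioned above, here is a Go sketch that builds trigger statements modelled on the pt-online-schema-change style. It is not the code from this PR: the function name `BuildCutOverTriggers`, the `_ins` / `_upd` / `_del` name suffixes and the example schema are hypothetical, and it assumes a single-column primary key and identical column lists on both tables.

```go
package main

import (
	"fmt"
	"strings"
)

// quote wraps a MySQL identifier in backticks, escaping embedded backticks.
func quote(name string) string {
	return "`" + strings.ReplaceAll(name, "`", "``") + "`"
}

// BuildCutOverTriggers returns three CREATE TRIGGER statements that mirror
// INSERT, UPDATE and DELETE on the original table into the ghost table.
// Hypothetical helper: assumes a single-column primary key named pk.
func BuildCutOverTriggers(schema, table, ghost, pk string, columns []string) []string {
	orig := quote(schema) + "." + quote(table)
	gho := quote(schema) + "." + quote(ghost)
	qpk := quote(pk)

	cols := make([]string, len(columns))
	newVals := make([]string, len(columns))
	for i, c := range columns {
		cols[i] = quote(c)
		newVals[i] = "NEW." + quote(c)
	}
	// REPLACE makes the trigger idempotent if the row already exists in the ghost table.
	replace := fmt.Sprintf("REPLACE INTO %s (%s) VALUES (%s)",
		gho, strings.Join(cols, ", "), strings.Join(newVals, ", "))

	name := func(suffix string) string {
		return quote(schema) + "." + quote(table+"_"+suffix)
	}

	return []string{
		fmt.Sprintf("CREATE TRIGGER %s AFTER INSERT ON %s FOR EACH ROW %s",
			name("ins"), orig, replace),
		// On UPDATE, drop the old ghost row first if the primary key changed.
		fmt.Sprintf("CREATE TRIGGER %s AFTER UPDATE ON %s FOR EACH ROW BEGIN "+
			"DELETE IGNORE FROM %s WHERE NOT (OLD.%s <=> NEW.%s) AND %s.%s <=> OLD.%s; %s; END",
			name("upd"), orig, gho, qpk, qpk, gho, qpk, qpk, replace),
		fmt.Sprintf("CREATE TRIGGER %s AFTER DELETE ON %s FOR EACH ROW "+
			"DELETE IGNORE FROM %s WHERE %s.%s <=> OLD.%s",
			name("del"), orig, gho, gho, qpk, qpk),
	}
}

func main() {
	// Example table and ghost names are made up for the illustration.
	for _, stmt := range BuildCutOverTriggers("shop", "orders", "_orders_gho", "id", []string{"id", "total"}) {
		fmt.Println(stmt)
	}
}
```

Using REPLACE and DELETE IGNORE keeps each trigger safe to run against rows that gh-ost has already copied or not yet copied.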
gh-ost needs to handle that write delegation phase carefully, because adding the triggers and stopping gh-ost from applying writes cannot happen at exactly the same moment, i.e. atomically.
After this delegation phase, it is necessary to sanitise all the write events that occurred between its start and its end.
To handle this scenario safely, the trigger cut-over strategy performs these steps:
- First, gh-ost injects a stop writes event into the binlog and stops applying writes once it receives it.
- Triggers are created to propagate modifications from the original table to the ghost one.
- A created triggers event is injected in the binlog to notify gh-ost that triggers are in place.
- Rows affected during the write delegation swap may be in an inconsistent state, so those events are sanitised by removing the rows from the ghost table and adding them again from the original table, after which both tables are in sync.
- After this, if any error occurs, the triggers are removed.
- Finally, the tables are swapped.
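The steps above can be sketched as a small in-memory simulation. This is only a model of the idea, not gh-ost's actual code: the types `Event`, `Table` and `Delegator`, the marker names and the `Simulate` function are hypothetical, tables are plain maps keyed by primary key, and only upserts are modelled.

```go
package main

import "fmt"

// EventKind distinguishes row changes from the two marker events.
type EventKind int

const (
	RowChange       EventKind = iota // a write on the original table seen in the binlog
	StopWrites                       // marker: gh-ost stops applying binlog writes
	TriggersCreated                  // marker: triggers now mirror writes to the ghost table
)

// Event is a simplified binlog event.
type Event struct {
	Kind  EventKind
	Key   int
	Value string
}

// Table maps a primary key to a row value.
type Table map[int]string

// Delegator tracks the hand-over of writes from gh-ost to the triggers.
type Delegator struct {
	applying bool         // gh-ost still applies row events to the ghost table
	inWindow bool         // between StopWrites and TriggersCreated
	dirty    map[int]bool // keys written during the window, to be sanitised
}

func NewDelegator() *Delegator {
	return &Delegator{applying: true, dirty: map[int]bool{}}
}

// Handle processes one binlog event in order.
func (d *Delegator) Handle(ev Event, ghost Table) {
	switch ev.Kind {
	case StopWrites:
		d.applying, d.inWindow = false, true
	case TriggersCreated:
		d.inWindow = false
	case RowChange:
		if d.applying {
			ghost[ev.Key] = ev.Value
		} else if d.inWindow {
			d.dirty[ev.Key] = true // may or may not have been mirrored by a trigger
		}
		// After TriggersCreated the triggers handle the write; nothing to do.
	}
}

// Sanitise removes each dirty row from the ghost table and copies it again
// from the original table.
func Sanitise(original, ghost Table, dirty map[int]bool) {
	for key := range dirty {
		delete(ghost, key)
		if v, ok := original[key]; ok {
			ghost[key] = v
		}
	}
}

// Simulate runs one hand-over and returns both tables.
func Simulate() (Table, Table) {
	original := Table{1: "a", 2: "b"}
	ghost := Table{1: "a", 2: "b"}
	d := NewDelegator()

	original[1] = "a2" // applied by gh-ost from the binlog
	d.Handle(Event{Kind: RowChange, Key: 1, Value: "a2"}, ghost)
	d.Handle(Event{Kind: StopWrites}, ghost)

	original[2] = "b2" // gap: gh-ost stopped, triggers not created yet
	d.Handle(Event{Kind: RowChange, Key: 2, Value: "b2"}, ghost)

	original[3], ghost[3] = "c", "c" // triggers exist and mirror this write
	d.Handle(Event{Kind: RowChange, Key: 3, Value: "c"}, ghost)
	d.Handle(Event{Kind: TriggersCreated}, ghost)

	Sanitise(original, ghost, d.dirty)
	return original, ghost
}

func main() {
	original, ghost := Simulate()
	fmt.Println("original:", original)
	fmt.Println("ghost:   ", ghost)
}
```

The write to key 2 falls in the gap where neither gh-ost nor the triggers copy it, which is why every key seen between the two markers is re-copied, even those a trigger may already have mirrored.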
Triggers
Even though gh-ost's philosophy is to avoid the use of triggers, with this new cut-over gh-ost is able to achieve a zero-error table swap on Galera-based MySQL with writes on multiple nodes, as in a common setup. Given that the cut-over phase should take a very short time, adding triggers only during this phase shouldn't be a problem.
@shlomi-noach I know you don't have a lot of free time lately, but could you at least take a look at the approach and the possibility of adding a trigger-based cut-over? We have been using it in production for a couple of months and it is showing good results.
If you have any questions, don't hesitate to let me know.
@jfudally I don't know if you can tell me how feasible it would be to integrate this new cut-over for Galera / XtraDB.
In case it helps, I wrote a post some time ago explaining this particular case: https://gonlo2.github.io/blog/en/blog/2020-03-04-why-dont-you-see-ghosts-in-galeras/
Unfortunately, I don't know enough about Galera to make a decision on this PR.
/cc @timvaillancourt @gtowey maybe?