Correct way to upgrade the operator from 0.23.7 to 0.24+ with multiple ClickHouse Keeper replicas
Hello, I am following the https://github.com/Altinity/clickhouse-operator/blob/0.24.0/docs/keeper_migration_from_23_to_24.md instructions to upgrade the operator from 0.23.7 to 0.24.5.
Unfortunately, I ran into an issue while upgrading an installation with 3 ClickHouse Keeper replicas:
With the old operator version all keepers lived in a single StatefulSet exposed through a single headless service; with 0.24.5 a separate StatefulSet and Service is created for each replica, which changes the addresses of all keeper replicas.
For example, in my installation on the old version I had 3 keeper addresses:
clickhouse-keeper-logging-0.clickhouse-keeper-logging-headless
clickhouse-keeper-logging-1.clickhouse-keeper-logging-headless
clickhouse-keeper-logging-2.clickhouse-keeper-logging-headless
And after the update they changed to:
chk-clickhouse-keeper-logging-default-0-0
chk-clickhouse-keeper-logging-default-1-0
chk-clickhouse-keeper-logging-default-2-0
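For reference, the new per-replica objects can be listed like this (the object name prefix is the one from my installation and will differ in yours):
# List the StatefulSets and Services the 0.24.x operator creates per keeper replica
kubectl get statefulsets,services | grep chk-clickhouse-keeper-logging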
I can see that <raft_configuration> was indeed updated to the correct hosts after the upgrade:
<raft_configuration>
    <server>
        <id>0</id>
        <hostname>chk-clickhouse-keeper-logging-default-0-0</hostname>
        <port>9444</port>
    </server>
    <server>
        <id>1</id>
        <hostname>chk-clickhouse-keeper-logging-default-0-1</hostname>
        <port>9444</port>
    </server>
    <server>
        <id>2</id>
        <hostname>chk-clickhouse-keeper-logging-default-0-2</hostname>
        <port>9444</port>
    </server>
</raft_configuration>
However, even with the updated raft_configuration, the keeper replicas were unable to reach each other: the hosts stored in the replication log were not updated in the process, so the replicas kept using the old addresses:
clickhouse-keeper-logging-1:/$ clickhouse-keeper client -h localhost --port 2181 -q "get '/keeper/config'"
server.0=clickhouse-keeper-logging-0.clickhouse-keeper-logging-headless:9444;participant;1
server.2=clickhouse-keeper-logging-2.clickhouse-keeper-logging-headless:9444;participant;1
server.1=clickhouse-keeper-logging-1.clickhouse-keeper-logging-headless:9444;participant;1
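A quick way to confirm that every replica holds the same stale membership is to run the same query against each new pod (the pod names below are my guess at the StatefulSet pod naming, adjust them to your installation):
# Compare /keeper/config as seen by each of the three new keeper pods
for pod in chk-clickhouse-keeper-logging-default-0-0-0 \
           chk-clickhouse-keeper-logging-default-1-0-0 \
           chk-clickhouse-keeper-logging-default-2-0-0; do
  echo "== $pod =="
  kubectl exec "$pod" -- clickhouse-keeper client -h localhost --port 2181 -q "get '/keeper/config'"
done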
I found no easy way to make ClickHouse Keeper reload the host configuration from disk, and no similar issues reported, but maybe I am missing something here?
What I tried:
- Configuring the cluster so that the keeper hosts keep the same addresses as before. I failed to find any solution that does not involve keeping and maintaining additional k8s objects, which is not desirable, since the topology has changed and we now have 3 services for 3 keeper replicas instead of one.
- Adding the new hosts using the ZooKeeper reconfig command. It does not seem possible to incrementally change /keeper/config to the state matching raft_configuration, because server ids cannot be reused.
- Starting ClickHouse Keeper with the --force-recovery flag for a while. With that flag ClickHouse Keeper loads the configuration from raft_configuration into /keeper/config; however, in my upgrade process that presumably damages the replication log in some cases and leads to replication failures shortly after the upgrade. I am still investigating why I am having issues with this approach. (Rough command sketches for the last two attempts are below.)
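For context, the commands behind the last two attempts looked roughly like this; the reconfig argument format and the config file path are assumptions based on the keeper-client docs and the default package layout, so double-check them for your setup:
# Attempt 2: dynamic reconfiguration; a host can only be added under a new id,
# so /keeper/config can never be made to match raft_configuration, which keeps ids 0/1/2
clickhouse-keeper client -h localhost --port 2181 \
  -q "reconfig add 'server.3=chk-clickhouse-keeper-logging-default-0-0:9444;participant;1'"
# Attempt 3: forced recovery; the keeper process rebuilds /keeper/config from raft_configuration
clickhouse-keeper --config-file=/etc/clickhouse-keeper/keeper_config.xml --force-recovery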
I am in the process of upgrading too. My keeper installation name is clickhouse-keeper, and I see it created a service called keeper-clickhouse-keeper (along with individual services for each Keeper pod).
In my ClickHouseInstallation YAML I refer to the host as
zookeeper:
  nodes:
    - host: keeper-clickhouse-keeper
And it works. I tried creating a table and the definition was available on each server pod.
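As an extra sanity check (a generic query, not specific to this setup), listing the keeper root via system.zookeeper from any server pod confirms that the servers can actually talk to Keeper through that service:
# Run inside any clickhouse-server pod; returns the top-level znodes if the connection works
clickhouse-client -q "SELECT name FROM system.zookeeper WHERE path = '/'"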
I just wanted to share what I did; hopefully this is the correct way to access Keeper.
@Slach is it correct to do so?
@shahsiddharth08 Thank you for your reply. I use this setting in my ClickHouseInstallation too, and I also do not have connection problems between the ClickHouse servers and Keeper.
However, I do have a problem with the keeper replicas connecting to each other after the update. How many keeper replicas do you have in your installation?
However, I do have a problem with the keeper replicas connecting to each other after the update.
What do you mean by connecting to each other? Like ping?
I have 5 replicas.
I finally found a way to update my cluster with almost no read downtime and about 15 minutes of write downtime (in my case). The key was making sure there were no connections between the ClickHouse servers and Keeper during the update, and always using --force-recovery only on keeper leaders. In the update process I copied data from the old keeper to the new one instead of reattaching PVs to PVCs as recommended in the operator upgrade instructions: working with PVs was not allowed in my cluster, and copying the data guaranteed I always had an unspoiled backup of the replication log. In the end I manually patched the server configmaps with the new keeper address and restarted the servers, trading a short read downtime for reduced write downtime.
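In very rough outline the procedure was the following (a simplified sketch rather than the actual script; pod names, data paths and the exact order of safety steps are simplified here):
# 0. Make sure no clickhouse-server is talking to keeper (scale servers down or cut off the keeper port)
# 1. Copy the coordination data from the old keeper leader into the new chk pod
kubectl cp clickhouse-keeper-logging-0:/var/lib/clickhouse-keeper ./keeper-data
kubectl cp ./keeper-data chk-clickhouse-keeper-logging-default-0-0-0:/var/lib/clickhouse-keeper
# 2. Start only the leader with --force-recovery so /keeper/config is rebuilt from raft_configuration,
#    then start the remaining replicas normally and wait for the quorum to form
# 3. Point the servers at the new keeper address and restart them
kubectl edit configmap <server-config-configmap>   # replace the old keeper host with the new one
kubectl rollout restart statefulset <server-statefulset>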
I am attaching the script I used to update my cluster, in rather raw form, but I believe it might be useful to someone who runs into the same problem: update-script.zip
Meanwhile, I am still curious whether there was an easier way to do this.
@pekashy thank you so much for sharing your experience.