To do proper switchover in cluster
Hello @paunin, manual switchover is currently not working, possibly because SSH is not working at all.
First I tried to run the command in a standby container:
lars@laptop:~/Downloads/PostDock$ docker -D exec postdock_pgsecondary2_1 su -c "repmgr standby switchover" - postgres
NOTICE: executing switchover on node "node3" (ID: 3)
WARNING: unable to connect to remote host "pgmaster" via SSH
ERROR: unable to connect via SSH to host "pgmaster", user ""
DEBU[0000] [hijack] End of stdout
Then I tried to debug it interactively inside my master container:
lars@laptop:~/Downloads/PostDock$ docker exec -ti postdock_pgmaster_1 bash
root@82cdc571e72b:/# ssh pgsecondary1
ssh: connect to host pgsecondary1 port 22: Connection refused
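"Connection refused" usually means nothing is listening on port 22 at that host, rather than a key or authentication problem. A minimal sketch for checking every node at once is below. It assumes `bash` (for its `/dev/tcp` redirection) and the coreutils `timeout` command are present in the container. The function names `port_open` and `check_nodes` are made up for illustration and are not part of PostDock or repmgr.

```shell
#!/usr/bin/env bash
# Sketch: report whether a TCP port accepts connections on each cluster node.
# Assumption: bash with /dev/tcp support and coreutils `timeout` are installed.

# port_open HOST PORT -> exit status 0 if a TCP connection succeeds within 2 seconds
port_open() {
    local host="$1" port="$2"
    timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# check_nodes PORT HOST... -> prints one status line per host
check_nodes() {
    local port="$1" host
    shift
    for host in "$@"; do
        if port_open "$host" "$port"; then
            echo "${host}:${port} open"
        else
            echo "${host}:${port} closed or unreachable"
        fi
    done
}

# Hostnames taken from PARTNER_NODES in the compose file
check_nodes 22 pgmaster pgsecondary1 pgsecondary2
```

If a node reports "closed or unreachable", the SSH daemon is probably not running there (or the hostname does not resolve), so the key setup is not yet being exercised.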
I generated new SSH keys and enabled SSH in my docker-compose file:
version: '2'
services:
  pgmaster:
    restart: always
    image: registry.iznet/lk/hapg:ssh
    environment:
      SSH_ENABLE: 1
      NODE_ID: 1 # Integer number of node (not required if can be extracted from NODE_NAME var, e.g. node-45 => 1045)
      NODE_NAME: node1 # Node name
      CLUSTER_NODE_NETWORK_NAME: pgmaster # (default: hostname of the node)
      PARTNER_NODES: "pgmaster,pgsecondary1,pgsecondary2"
      REPLICATION_PRIMARY_HOST: pgmaster
      POSTGRES_PASSWORD: monkey_pass
      POSTGRES_USER: monkey_user
      POSTGRES_DB: monkey_db
      CLEAN_OVER_REWIND: 1
      MASTER_RESPONSE_TIMEOUT: 5
      RECONNECT_ATTEMPTS: 1
      CONFIGS: "listen_addresses:'*'"
      # in format variable1:value1[,variable2:value2[,...]] if CONFIGS_DELIMITER_SYMBOL=, and CONFIGS_ASSIGNMENT_SYMBOL=:
      # used for pgpool.conf file
      #defaults:
      CLUSTER_NAME: pg_cluster # default is pg_cluster
      REPLICATION_DB: replication_db # default is replication_db
      REPLICATION_USER: replication_user # default is replication_user
      REPLICATION_PASSWORD: replication_pass # default is replication_pass
    volumes:
      #- ./ssh:/var/lib/postgres/.ssh
      - ./ssh/:/home/postgres/.ssh/keys
  pgsecondary1:
    restart: always
    image: registry.iznet/lk/hapg:ssh
    environment:
      SSH_ENABLE: 1
      PARTNER_NODES: "pgmaster,pgsecondary1,pgsecondary2"
      REPLICATION_PRIMARY_HOST: pgmaster
      NODE_ID: 2
      NODE_NAME: node2
      CLUSTER_NODE_NETWORK_NAME: pgsecondary1
      CLEAN_OVER_REWIND: 1
      RECONNECT_ATTEMPTS: 1
      MASTER_RESPONSE_TIMEOUT: 5
    volumes:
      - ./ssh/:/home/postgres/.ssh/keys
  pgsecondary2:
    restart: always
    image: registry.iznet/lk/hapg:ssh
    environment:
      SSH_ENABLE: 1
      PARTNER_NODES: "pgmaster,pgsecondary1,pgsecondary2"
      REPLICATION_PRIMARY_HOST: pgmaster
      NODE_ID: 3
      NODE_NAME: node3
      CLUSTER_NODE_NETWORK_NAME: pgsecondary2
      CLEAN_OVER_REWIND: 1
      RECONNECT_ATTEMPTS: 1
      MASTER_RESPONSE_TIMEOUT: 5
    volumes:
      - ./ssh/:/home/postgres/.ssh/keys
I restored SSH (see the test), but I think there are more issues apart from SSH.
Hi Paunin,
We are trying to switch over a standby, and according to the repmgr instructions the repmgrd daemon is not supposed to be running during the switchover.
However, when I try to shut down repmgrd, the pod crashes and goes into a restart loop.
Can you please let us know how to accomplish this?
Thanks, Bhuvan.