[bitnami/etcd] Fix unbound host variable when disaster recovery is enabled
Description of the change
When you start a pod with disaster recovery enabled on an image at least as new as 3.5.4-debian-11-r14, the init script fails because of an unbound variable, host.
This adds the default port as a variable and makes sure that it uses hostname -f to get the host of the current container/pod.
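As a rough illustration of the idea (not the actual libetcd.sh change), the sketch below shows how a script running with unbound-variable checks can resolve the host explicitly and fall back to a default port. The names DEFAULT_PEER_PORT, current_host and current_endpoint are hypothetical, and 2380 is assumed as the default only because it is the port used in the URLs in this thread.

```shell
#!/bin/bash
# Treat references to unset variables as errors, which is the mode
# that produces a "host: unbound variable" failure.
set -u

# Hypothetical default port; assumed here for illustration only.
DEFAULT_PEER_PORT=2380

# Resolve this container's host name. Falls back to plain `hostname`,
# then to $HOSTNAME or "localhost", when `hostname -f` is unavailable.
current_host() {
    hostname -f 2>/dev/null || hostname 2>/dev/null || echo "${HOSTNAME:-localhost}"
}

# Print "host:port", using the default port when no argument is given.
current_endpoint() {
    local host port
    host="$(current_host)"
    port="${1:-$DEFAULT_PEER_PORT}"
    echo "${host}:${port}"
}

current_endpoint
```

Because host is always assigned before it is read and the port has a default, neither expansion can trip the unbound-variable check.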
Benefits
The server can start again
Possible drawbacks
None that I am aware of
Applicable issues
Additional information
Hi @jaysonsantos,
Could you rebase onto main?
@Mauraza done
Hi @jaysonsantos,
Could you add to this thread a way to reproduce the issue and the logs related to when the pod fails?
Hi @Mauraza, that was tricky to simulate, but you can see the behavior below. It mimics a state where the local data is broken and the current member has to restore the data from a snapshot.
mkdir -p etcd/{snapshots,data} && echo does not matter | tee etcd/{data/member_id,snapshots/.disaster_recovery} \
&& docker run -u $(id -u) --name etcd -e ETCD_DISABLE_PRESTOP=yes \
-e ETCD_ACTIVE_ENDPOINTS=does-not-matter \
-e ETCD_INITIAL_CLUSTER=http://localhost:2380,http://fake-down-server:2380 \
-e BITNAMI_DEBUG=yes -e ETCD_DISASTER_RECOVERY=yes \
-e ALLOW_NONE_AUTHENTICATION=yes --rm -it \
-v $PWD/etcd/snapshots:/snapshots \
-v $PWD/etcd/data:/bitnami/etcd/data \
bitnami/etcd:3.5.4-debian-11-r33
with the output:
etcd 18:41:47.60
etcd 18:41:47.62 Welcome to the Bitnami etcd container
etcd 18:41:47.65 Subscribe to project updates by watching https://github.com/bitnami/containers
etcd 18:41:47.67 Submit issues and feature requests at https://github.com/bitnami/containers/issues
etcd 18:41:47.69
etcd 18:41:47.72 INFO ==> ** Starting etcd setup **
etcd 18:41:47.84 INFO ==> Validating settings in ETCD_* env vars..
etcd 18:41:47.88 WARN ==> You set the environment variable ALLOW_NONE_AUTHENTICATION=yes. For safety reasons, do not use this flag in a production environment.
etcd 18:41:47.92 INFO ==> Initializing etcd
etcd 18:41:47.95 INFO ==> Generating etcd config file using env variables
etcd 18:41:48.17 INFO ==> Detected data from previous deployments
etcd 18:41:48.22 INFO ==> The member will try to join the cluster by it's own
/opt/bitnami/scripts/libetcd.sh: line 448: host: unbound variable
the same input with my fix renders the following:
docker run -u $(id -u) --name etcd -e ETCD_DISABLE_PRESTOP=yes \
-e ETCD_ACTIVE_ENDPOINTS=does-not-matter \
-e ETCD_INITIAL_CLUSTER=http://localhost:2380,http://fake-down-server:2380 \
-e BITNAMI_DEBUG=yes -e ETCD_DISASTER_RECOVERY=yes \
-e ALLOW_NONE_AUTHENTICATION=yes --rm -it \
-v $PWD/etcd/snapshots:/snapshots \
-v $PWD/etcd/data:/bitnami/etcd/data \
etcd-fix
etcd 18:44:19.23
etcd 18:44:19.26 Welcome to the Bitnami etcd container
etcd 18:44:19.28 Subscribe to project updates by watching https://github.com/bitnami/containers
etcd 18:44:19.31 Submit issues and feature requests at https://github.com/bitnami/containers/issues
etcd 18:44:19.33
etcd 18:44:19.35 INFO ==> ** Starting etcd setup **
etcd 18:44:19.48 INFO ==> Validating settings in ETCD_* env vars..
etcd 18:44:19.51 WARN ==> You set the environment variable ALLOW_NONE_AUTHENTICATION=yes. For safety reasons, do not use this flag in a production environment.
etcd 18:44:19.56 INFO ==> Initializing etcd
etcd 18:44:19.59 INFO ==> Generating etcd config file using env variables
etcd 18:44:19.82 INFO ==> Detected data from previous deployments
etcd 18:44:19.86 INFO ==> The member will try to join the cluster by it's own
etcd 18:44:20.21 DEBUG ==> Last member to recover from the disaster!
etcd 18:44:20.25 WARN ==> Cluster not responding!
etcd 18:44:20.31 ERROR ==> There was no snapshot to restore!
If there were a valid snapshot, the member would restore from it and keep going.

Hi @jaysonsantos,
The environment variable ETCD_INITIAL_CLUSTER https://etcd.io/docs/v3.1/op-guide/configuration/#--initial-cluster only supports one URL. Could you try with ETCD_ADVERTISE_CLIENT_URLS instead?
mkdir -p etcd/{snapshots,data} && echo does not matter | tee etcd/{data/member_id,snapshots/.disaster_recovery} \
&& docker run -u $(id -u) --name etcd \
-e ETCD_DISABLE_PRESTOP=yes \
-e ETCD_ACTIVE_ENDPOINTS=does-not-matter \
-e ETCD_ADVERTISE_CLIENT_URLS=http://localhost:2380,http://fake-down-server:2380 \
-e BITNAMI_DEBUG=yes \
-e ETCD_DISASTER_RECOVERY=yes \
-e ALLOW_NONE_AUTHENTICATION=yes \
--rm -it \
-v $PWD/etcd/snapshots:/snapshots \
-v $PWD/etcd/data:/bitnami/etcd/data bitnami/etcd:3.5.4-debian-11-r33
Hi there @Mauraza, I took that config from a running container that was created by the helm chart. Maybe the chart should always set it as a single value?
This is the place where it sets more than one url: https://github.com/bitnami/charts/blob/d36311748078c08e2ad5a8cc64b2c02007304636/bitnami/etcd/templates/statefulset.yaml#L194-L199
Hi @jaysonsantos,
That is right. For that case you need to set the environment variable as ETCD_INITIAL_CLUSTER=one=http://localhost:2380,fake=http://fake-down-server:2380.
You can check this docker-compose as an example.
https://github.com/bitnami/containers/blob/013e48a91036db911a706a0ed4aa133de35ba772/bitnami/etcd/docker-compose-cluster.yml#L14
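To make the name=URL format concrete, here is a minimal sketch that splits such a value into its members. The helper list_members is hypothetical and is not part of the Bitnami scripts; the value is the one quoted in the comment above.

```shell
#!/bin/bash
set -u

# Comma-separated list of name=peerURL pairs.
ETCD_INITIAL_CLUSTER="one=http://localhost:2380,fake=http://fake-down-server:2380"

# Print one "name url" line per member (hypothetical helper, for illustration only).
list_members() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        # Text before the first "=" is the member name, the rest is its peer URL.
        echo "${entry%%=*} ${entry#*=}"
    done
}

list_members "$ETCD_INITIAL_CLUSTER"
# prints:
# one http://localhost:2380
# fake http://fake-down-server:2380
```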
Hi, yes, but I chose those variables to mimic the state in which the helm chart renders the containers, and the fix is meant to prevent that failure. When disaster recovery is enabled and the server has to perform it, the script breaks on versions after that r11 version. In the end, the script was just a means of showcasing the error and the fix.
Hi @jaysonsantos,
Could you share the logs of the error and the values of the chart? I will try to reproduce it.
Sure, I will try to deploy another instance of it and reproduce the error.
This Pull Request has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thank you for your contribution.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Pull Request. Do not hesitate to reopen it later if necessary.