
How to deploy clickhouse along with clickhouse-keeper using the clickhouse operator?

Open kant777 opened this issue 3 years ago • 14 comments

Currently the docs still point to Zookeeper and also say Zookeeper is required, while Clickhouse 22.3 says Clickhouse-Keeper is production ready! I do see some k8s files for Clickhouse-Keeper, but those seem to imply running a Clickhouse-Keeper cluster separately, just like ZK.

I am looking to run a Clickhouse cluster with Clickhouse-Keeper embedded into some of the Clickhouse nodes in the cluster. It would be great to have an example using k8s directly or through the clickhouse-operator.

kant777 avatar Jun 15 '22 12:06 kant777

It's a bit heavy on configuration, but it is possible. I'm hoping the operator will get support for this pretty soon, but it can be added manually for now.

Here's what I did to get it working:

  • In the replicaServiceTemplate you'll need to expose ports for zookeeper (2181) as well as raft (9444)
  • In the serviceTemplate you'll just need to also expose 2181
  • Set your zookeeper host to the LoadBalancer/ClusterIP that the operator creates (and on which you have exposed port 2181)
  • You then need to customize each pod so that the pods you want contain a <keeper_server> configuration with the correct <server_id> for that pod, as well as the raft configuration pointing to all the other instances that will be running clickhouse-keeper (see the sketch below)
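
For illustration, here's a minimal sketch of what such a per-pod config file could look like (hostnames and paths are placeholders, and the <server_id> must differ on every pod):

<!-- e.g. /etc/clickhouse-server/config.d/keeper.xml, sketch only -->
<yandex>
    <keeper_server>
        <tcp_port>2181</tcp_port>                 <!-- port clients (and clickhouse itself) connect to -->
        <server_id>1</server_id>                  <!-- unique per pod: 1, 2, 3, ... -->
        <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
        <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
        <raft_configuration>
            <!-- one <server> entry per pod that runs clickhouse-keeper -->
            <server><id>1</id><hostname>chi-example-cluster-0-0</hostname><port>9444</port></server>
            <server><id>2</id><hostname>chi-example-cluster-0-1</hostname><port>9444</port></server>
            <server><id>3</id><hostname>chi-example-cluster-0-2</hostname><port>9444</port></server>
        </raft_configuration>
    </keeper_server>
</yandex>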

alexvanolst avatar Jun 29 '22 06:06 alexvanolst

@alexvanolst Is there somewhere an example? I cannot follow on what to do in the last step.

rgarcia89 avatar Jul 18 '22 13:07 rgarcia89

I haven't fully tested this yet, but initially it seems to work. Sorry for the bad formatting here, but I thought I'd share it quickly so others can at least play around with it. The value for the Zookeeper servers is set as below:

servers:
    - host: service-posthog-clickhouse-0-0
      port: 9181
    - host: service-posthog-clickhouse-0-1
      port: 9181
    - host: service-posthog-clickhouse-0-2
      port: 9181

These hosts are generated from the replica service templates (the ones with generateName: service-{chi}-{shard}-{replica}).

I based this off of the example yaml from here. It could use some cleanup (like creating the config file in an init container with a shared volume instead), but I haven't gotten to that yet.

{{- if .Values.clickhouse.enabled }}
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: {{ template "posthog-plural.name" . }}-clickhouse
spec:
  defaults:
    templates:
      serviceTemplate: service-template
      replicaServiceTemplate: replica-service-template
  configuration:
    users:
      {{- template "clickhouse.passwordValue" . }}
      {{ .Values.clickhouse.user }}/networks/ip:
        {{- range $.Values.clickhouse.allowedNetworkIps }}
        - {{ . | quote }}
        {{- end }}
      {{ .Values.clickhouse.user }}/profile: default
      {{ .Values.clickhouse.user }}/quota: default
      {{- if .Values.clickhouse.backup.enabled }}
      {{ .Values.clickhouse.backup.backup_user }}/networks/ip: "0.0.0.0/0"
      {{ template "clickhouse.backupPasswordValue" . }}
      {{- end}}
      {{- if .Values.clickhouse.additionalUsersConfig }}
      {{- .Values.clickhouse.additionalUsersConfig | toYaml | nindent 6 }}
      {{- end}}
    profiles:
      {{- merge dict .Values.clickhouse.profiles .Values.clickhouse.defaultProfiles | toYaml | nindent 6 }}

    clusters:
      - name: {{ .Values.clickhouse.cluster | quote }}
        templates:
          podTemplate: pod-template
          clusterServiceTemplate: cluster-service-template
          {{- if and (.Values.clickhouse.persistence.enabled) (not .Values.clickhouse.persistence.existingClaim) }}
          dataVolumeClaimTemplate: data-volumeclaim-template
          {{- end }}
        layout:
          {{- toYaml .Values.clickhouse.layout | nindent 10 }}

    settings:
      {{- merge dict .Values.clickhouse.settings .Values.clickhouse.defaultSettings | toYaml | nindent 6 }}

    files:
      events.proto: |
        syntax = "proto3";
        message Event {
          string uuid = 1;
          string event = 2;
          string properties = 3;
          string timestamp = 4;
          uint64 team_id = 5;
          string distinct_id = 6;
          string created_at = 7;
          string elements_chain = 8;
        }

    zookeeper:
      nodes:
      {{- if .Values.clickhouse.externalZookeeper }}
        {{- toYaml .Values.clickhouse.externalZookeeper.servers | nindent 8 }}
      {{- end }}

  templates:
    podTemplates:
      - name: pod-template
          {{- if .Values.clickhouse.podAnnotations }}
        metadata:
          annotations: {{ toYaml .Values.clickhouse.podAnnotations | nindent 12 }}
          {{- end }}
        {{- if .Values.clickhouse.podDistribution }}
        podDistribution: {{ toYaml .Values.clickhouse.podDistribution | nindent 12 }}
        {{- end}}
        spec:
          {{- if .Values.clickhouse.affinity }}
          affinity: {{ toYaml .Values.clickhouse.affinity | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.tolerations }}
          tolerations: {{ toYaml .Values.clickhouse.tolerations | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.nodeSelector }}
          nodeSelector: {{ toYaml .Values.clickhouse.nodeSelector | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.topologySpreadConstraints }}
          topologySpreadConstraints: {{ toYaml .Values.clickhouse.topologySpreadConstraints | nindent 12 }}
          {{- end }}

          {{- if .Values.clickhouse.securityContext.enabled }}
          securityContext: {{- omit .Values.clickhouse.securityContext "enabled" | toYaml | nindent 12 }}
          {{- end }}

          {{- if .Values.clickhouse.image.pullSecrets }}
          imagePullSecrets:
            {{- range .Values.clickhouse.image.pullSecrets }}
            - name: {{ . }}
            {{- end }}
          {{- end }}

          containers:
            - name: clickhouse
              image: {{ template "posthog.clickhouse.image" . }}
              env:
              command:
                - /bin/bash
                - -c
                - /usr/bin/clickhouse-server --config-file=/etc/clickhouse-server/config.xml
              ports:
                - name: http
                  containerPort: 8123
                - name: client
                  containerPort: 9000
                - name: interserver
                  containerPort: 9009
              {{- if .Values.clickhouse.persistence.enabled }}
              volumeMounts:
              {{- if .Values.clickhouse.persistence.existingClaim }}
                - name: existing-volumeclaim
              {{- else }}
                - name: data-volumeclaim-template
              {{- end }}
                  mountPath: /var/lib/clickhouse
              {{- end }}

              {{- if .Values.clickhouse.resources }}
              resources: {{ toYaml .Values.clickhouse.resources | nindent 16 }}
              {{- end }}
            {{- if .Values.clickhouse.backup.enabled }}
            - name: clickhouse-backup
              image: {{ template "posthog_backup.clickhouse.image" . }}
              imagePullPolicy: {{ .Values.clickhouse.backup.image.pullPolicy }}
              command:
                - /bin/bash
                - -c
                - /bin/clickhouse-backup server
              {{- with .Values.clickhouse.backup.env }}
              env:
                {{- toYaml . | nindent 16 }}
              {{- end}}
              ports:
                - name: backup-rest
                  containerPort: 7171
            {{- end }}
      - name: pod-template-clickhouse-keeper
          {{- if .Values.clickhouse.podAnnotations }}
        metadata:
          annotations: {{ toYaml .Values.clickhouse.podAnnotations | nindent 12 }}
          {{- end }}
        {{- if .Values.clickhouse.podDistribution }}
        podDistribution: {{ toYaml .Values.clickhouse.podDistribution | nindent 12 }}
        {{- end}}
        spec:
          {{- if .Values.clickhouse.affinity }}
          affinity: {{ toYaml .Values.clickhouse.affinity | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.tolerations }}
          tolerations: {{ toYaml .Values.clickhouse.tolerations | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.nodeSelector }}
          nodeSelector: {{ toYaml .Values.clickhouse.nodeSelector | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.topologySpreadConstraints }}
          topologySpreadConstraints: {{ toYaml .Values.clickhouse.topologySpreadConstraints | nindent 12 }}
          {{- end }}

          {{- if .Values.clickhouse.securityContext.enabled }}
          securityContext: {{- omit .Values.clickhouse.securityContext "enabled" | toYaml | nindent 12 }}
          {{- end }}

          {{- if .Values.clickhouse.image.pullSecrets }}
          imagePullSecrets:
            {{- range .Values.clickhouse.image.pullSecrets }}
            - name: {{ . }}
            {{- end }}
          {{- end }}

          containers:
            - name: clickhouse
              image: {{ template "posthog.clickhouse.image" . }}
              env:
              - name: KEEPER_SERVERS
                value: {{ .Values.clickhouse.layout.replicasCount | quote }}
              - name: RAFT_PORT
                value: "9444"
              command:
                - /bin/bash
                - -c
                - |
                  HOST=`hostname -s` &&
                  DOMAIN=`hostname -d` &&
                  if [[ $HOST =~ (.*)-([0-9]+)-([0-9]+)$ ]]; then
                      NAME=${BASH_REMATCH[1]}
                      ORD=${BASH_REMATCH[2]}
                      SUFFIX=${BASH_REMATCH[3]}
                  else
                      echo "Failed to parse name and ordinal of Pod"
                      exit 1
                  fi &&
                  if [[ $DOMAIN =~ (.*)-([0-9]+)(.posthog.svc.cluster.local)$ ]]; then
                      DOMAIN_NAME=${BASH_REMATCH[1]}
                      DOMAIN_ORD=${BASH_REMATCH[2]}
                      DOMAIN_SUFFIX=${BASH_REMATCH[3]}
                  else
                      echo "Failed to parse name and ordinal of Pod"
                      exit 1
                  fi &&
                  export MY_ID=$((ORD+1)) &&
                  mkdir -p /tmp/clickhouse-keeper/config.d/ &&
                  {
                    echo "<yandex><keeper_server>"
                    echo "<server_id>${MY_ID}</server_id>"
                    echo "<raft_configuration>"
                    for (( i=1; i<=$KEEPER_SERVERS; i++ )); do
                        echo "<server><id>${i}</id><hostname>$NAME-$((i-1))-${SUFFIX}.${DOMAIN_NAME}-$((i-1))${DOMAIN_SUFFIX}</hostname><port>${RAFT_PORT}</port></server>"
                    done
                    echo "</raft_configuration>"
                    echo "</keeper_server></yandex>"
                  } > /tmp/clickhouse-keeper/config.d/generated-keeper-settings.xml &&
                  cat /tmp/clickhouse-keeper/config.d/generated-keeper-settings.xml &&
                  /usr/bin/clickhouse-server --config-file=/etc/clickhouse-server/config.xml

              ports:
                - name: http
                  containerPort: 8123
                - name: client
                  containerPort: 9000
                - name: interserver
                  containerPort: 9009
                - name: raft
                  containerPort: 9444
                - name: ch-keeper
                  containerPort: 9181
              {{- if .Values.clickhouse.persistence.enabled }}
              volumeMounts:
              {{- if .Values.clickhouse.persistence.existingClaim }}
                - name: existing-volumeclaim
              {{- else }}
                - name: data-volumeclaim-template
              {{- end }}
                  mountPath: /var/lib/clickhouse
              {{- end }}
              # configures probes for clickhouse keeper
              # without this, traffic is not sent through the service and clickhouse keeper cannot start
              readinessProbe:
                tcpSocket:
                  port: 9444
                initialDelaySeconds: 10
                timeoutSeconds: 5
                periodSeconds: 10
                failureThreshold: 3
              livenessProbe:
                tcpSocket:
                  port: 9181
                initialDelaySeconds: 30
                timeoutSeconds: 5
                periodSeconds: 10

              {{- if .Values.clickhouse.resources }}
              resources: {{ toYaml .Values.clickhouse.resources | nindent 16 }}
              {{- end }}
            {{- if .Values.clickhouse.backup.enabled }}
            - name: clickhouse-backup
              image: {{ template "posthog_backup.clickhouse.image" . }}
              imagePullPolicy: {{ .Values.clickhouse.backup.image.pullPolicy }}
              command:
                - /bin/bash
                - -c
                - /bin/clickhouse-backup server
              {{- with .Values.clickhouse.backup.env }}
              env:
                {{- toYaml . | nindent 16 }}
              {{- end}}
              ports:
                - name: backup-rest
                  containerPort: 7171
            {{- end }}

    serviceTemplates:
      - name: service-template
        generateName: service-{chi}
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
            - name: clickhouse-keeper
              port: 9181
          type: {{ .Values.clickhouse.serviceType }}
      - name: cluster-service-template
        generateName: service-{chi}-{cluster}
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
          type: ClusterIP
          clusterIP: None
      - name: replica-service-template
        generateName: service-{chi}-{shard}-{replica}
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
            - name: interserver
              port: 9009
          type: ClusterIP
      - name: replica-service-template-clickhouse-keeper
        generateName: service-{chi}-{shard}-{replica}
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
            - name: interserver
              port: 9009
            - name: clickhouse-keeper
              port: 9181
            - name: raft
              port: 9444
          type: ClusterIP

    {{- if and (.Values.clickhouse.persistence.enabled) (not .Values.clickhouse.persistence.existingClaim) }}
    volumeClaimTemplates:
      - name: data-volumeclaim-template
        spec:
          {{- if .Values.clickhouse.persistence.storageClass }}
          storageClassName: {{ .Values.clickhouse.persistence.storageClass }}
          {{- end }}
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: {{ .Values.clickhouse.persistence.size | quote }}
    {{- end }}

{{- end }}
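
For reference, a rough sketch of the chart values this template consumes (key names follow the template above; the hostnames, cluster name and sizes are just placeholders):

clickhouse:
  enabled: true
  cluster: posthog
  serviceType: ClusterIP
  layout:
    shardsCount: 1
    replicasCount: 3
  externalZookeeper:
    # point clickhouse at the keeper port (9181) exposed on the generated replica services
    servers:
      - host: service-posthog-clickhouse-0-0
        port: 9181
      - host: service-posthog-clickhouse-0-1
        port: 9181
      - host: service-posthog-clickhouse-0-2
        port: 9181
  persistence:
    enabled: true
    size: 20Gi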

davidspek avatar Dec 20 '22 21:12 davidspek

@DavidSpek That looks interesting. Is it part of a larger helm chart you could share? Thanks!

spoofedpacket avatar Feb 23 '23 14:02 spoofedpacket

@spoofedpacket Along the way we discovered some issues with the original solution I posted above, but I have now edited it. It's part of our helm chart for deploying PostHog using Plural. The current setup is working well and we are using it to run our PostHog production deployment. The template file for the ClickHouse instance can be found here, with the values being set here.

Some interesting things to note: to allow for increasing the number of shards for scaling, we are using dedicated pod and service templates for the ClickHouse-Keeper replicas (see here and here), and the ClickHouse-Keeper configuration file, along with the special templates, is set for the 3 replicas of the first shard (here).
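
In the CHI layout that looks roughly like the following (a sketch; the shard and replica counts are illustrative, and the remaining shards keep the cluster's default templates):

layout:
  shardsCount: 3
  replicasCount: 3
  shards:
    - replicas:
        # only the three replicas of the first shard run clickhouse-keeper,
        # so they get the dedicated pod and service templates from above
        - templates:
            podTemplate: pod-template-clickhouse-keeper
            replicaServiceTemplate: replica-service-template-clickhouse-keeper
        - templates:
            podTemplate: pod-template-clickhouse-keeper
            replicaServiceTemplate: replica-service-template-clickhouse-keeper
        - templates:
            podTemplate: pod-template-clickhouse-keeper
            replicaServiceTemplate: replica-service-template-clickhouse-keeper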

davidspek avatar Mar 06 '23 11:03 davidspek

@spoofedpacket @alexvanolst I've since made a small helm chart that lets you easily deploy clickhouse with the operator, with support for the built-in clickhouse keeper. See https://github.com/pluralsh/module-library/tree/main/helm/clickhouse

davidspek avatar May 19 '23 13:05 davidspek

@DavidSpek according to https://github.com/pluralsh/module-library/blob/main/helm/clickhouse/templates/clickhouse_instance.yaml#L282

Is your embedded clickhouse-keeper installation stable now? Did you compare performance for zookeeper vs clickhouse-keeper?

Slach avatar May 19 '23 14:05 Slach

@Slach It seems to be working correctly and we haven't had any issues with it. However, I'm not a clickhouse expert, nor have I had the time to compare performance with Zookeeper. I'm assuming upstream tests and performance evaluations will still be valid for this configuration. I do welcome any help with testing this setup from more experienced clickhouse users. Last night I actually thought about how this could be handled by the operator, so when I have the time I might look into implementing it.

davidspek avatar May 19 '23 14:05 davidspek

@DavidSpek roger that. Anyway, thank you for your efforts! ;-)

Slach avatar May 19 '23 14:05 Slach

@Slach Would you be open to a contribution that applies a similar configuration in the operator? There it would be possible to do some more advanced logic in terms of distributing the keeper instances across the nodes.

davidspek avatar May 19 '23 14:05 davidspek

Yes, we are open to contributions.

kind: ClickHouseKeeper should be implemented as a separate CRD. Each instance of clickhouse-keeper shall be deployed as a separate StatefulSet (to allow separate management) and could be linked inside the kind: ClickHouseInstallation and kind: ClickHouseInstallationTemplate CRDs.
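
Just to illustrate the idea, a hypothetical manifest for discussion only (not an existing or final API; names and fields would be defined during implementation):

# hypothetical sketch; one StatefulSet pod per keeper instance
apiVersion: "clickhouse-keeper.altinity.com/v1"
kind: "ClickHouseKeeper"
metadata:
  name: keeper-example
spec:
  replicas: 3
  templates:
    podTemplate: keeper-pod-template
    volumeClaimTemplate: keeper-data-template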

Slach avatar May 19 '23 20:05 Slach

@Slach Sorry for not getting back to you quicker about this. While I see the value of having a separate CRD for ClickHouseKeeper, would you also be open to adding the functionality for running an embedded ClickHouse Keeper within a ClickHouse installation? For me the main benefit of ClickHouse Keeper is less maintenance and resource overhead since it doesn't require dedicated pods.

davidspek avatar Jul 07 '23 09:07 davidspek

@DavidSpek An embedded clickhouse-keeper will restrict your scalability, because you need fewer clickhouse-keeper instances than clickhouse-server instances. For example, a typical installation runs two clickhouse-server replicas per shard, and the replicas could be in different DCs, while a usual clickhouse-keeper installation is 1 or 3 instances for a quick quorum (an odd number of keeper instances per datacenter) to avoid split-brain situations.

Slach avatar Jul 07 '23 11:07 Slach

@Slach I am aware of that, and also of the fact that in almost all cases you wouldn't want to scale Raft past 5 or 7 nodes due to performance issues. However, the operator could implement some smart logic in terms of how many clickhouse-keeper nodes to run and how those are spread across the regular clickhouse nodes. As a quick example:

  • If nodes < 3, run 1 keeper
  • If nodes >= 3 and <= 5, run 3 keepers

Then some more comparisons can be done in terms of whether the keepers can be split nicely across shards or replicas. It could even take the affinity rules into account so that the nodes running keeper are in separate AZs.
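
As a rough sketch of that selection logic (the cap at 5 keepers is my assumption, following the Raft sizing note above):

# sketch only: pick an odd number of keeper instances based on the clickhouse node count
keeper_count() {
  local nodes=$1
  if (( nodes < 3 )); then
    echo 1
  elif (( nodes <= 5 )); then
    echo 3
  else
    echo 5     # assumption: cap at 5, since scaling Raft further rarely helps
  fi
}

keeper_count 4   # prints 3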

davidspek avatar Aug 09 '23 23:08 davidspek