feature request: read-only replicas service

Open mbrancato opened this issue 1 year ago • 7 comments

It would be nice if the Operator offered a way to define two separate sets of replicas: one for read-write usage and one for read-only. While these would all be replicas of a single shard, and technically all read-write (from what I understand), the operator could help here by placing specific read-only labels on the CHI pods that serve SELECT queries. A separate Kubernetes Service resource that points only to those pods could then be used.

I know this is a simple explanation, and it gets more complex with multiple shards. There are some ways to do this in the existing operator, but a nice interface in the CRD would be great.

mbrancato avatar Aug 29 '24 23:08 mbrancato

It doesn't look like you can work with multiple service templates, but according to the docs it looks like you can work with multiple pod templates: https://github.com/Altinity/clickhouse-operator/blob/master/docs/custom_resource_explained.md#clusters-and-layouts

You could define 2 custom pod templates for your two sets of replicas, applying diverging labels between the two. You could have your clickhouse-operator-managed service template select one of those label values, and manually manage another service that selects the other.
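A minimal sketch of that idea, assuming labels set under a pod template's `metadata` are propagated to the pods. The CHI name `demo`, the template names, the image tag, and the `example.com/role` label are all hypothetical, and the `clickhouse.altinity.com/chi` selector assumes the operator's usual pod labelling:

```yaml
# Sketch only: names, image tag, and label keys are illustrative assumptions.
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  name: demo
spec:
  configuration:
    clusters:
    - name: main
      layout:
        shardsCount: 1
        replicas:
        - templates:
            podTemplate: clickhouse-write
        - templates:
            podTemplate: clickhouse-read
  templates:
    podTemplates:
    - name: clickhouse-write
      metadata:
        labels:
          example.com/role: write   # hypothetical label
      spec:
        containers:
        - name: clickhouse
          image: clickhouse/clickhouse-server:latest
    - name: clickhouse-read
      metadata:
        labels:
          example.com/role: read    # hypothetical label
      spec:
        containers:
        - name: clickhouse
          image: clickhouse/clickhouse-server:latest
---
# Manually managed Service (not created by the operator) that
# selects only the pods carrying the read label.
apiVersion: v1
kind: Service
metadata:
  name: clickhouse-read
spec:
  type: ClusterIP
  selector:
    clickhouse.altinity.com/chi: demo
    example.com/role: read
  ports:
  - name: http
    port: 8123
  - name: client
    port: 9000
```

Since the operator does not own the second Service in this sketch, it would not be updated or cleaned up together with the CHI.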

gregakinman avatar Mar 25 '25 21:03 gregakinman

@sunsingerus looks like you're the main contributor at the moment - is this something that aligns with what clickhouse-operator would want to support? Essentially, allowing specifying a per-replica serviceTemplate in the layouts section, similar to how you can specify a per-replica podTemplate.

clickhouse-operator could either inject its own labels in order to ensure that the corresponding services select the correct pods, or that could be left up to the user (i.e., user is responsible for setting up labels in their different pod templates that match up with the selectors they create in their different service templates).

gregakinman avatar Mar 25 '25 21:03 gregakinman

https://edgedelta.com/company/blog/clickhouse-read-write-pod-separation

Let’s take a look at the overall ClickHouse architecture:

  1. Deploy ClickHouse to Kubernetes using Altinity Kubernetes Operator for ClickHouse.
  2. Create two explicitly different templates for read and write pods.
  3. Label the nodes as either read or write.
  4. Ensure the read pod template includes the option always_fetch_merged_part: true because we want merges to happen only on the write pods.
  5. Create two services to select on read and write pods.
  6. Route our read requests to read service and write requests to write service.

I'm guessing it's either a) implied in step 5 above that these services are not managed by clickhouse-operator or b) the author of that post knows how to do it using clickhouse-operator and we don't.
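For step 4, one possible shape is sketched below. It assumes the operator accepts a replica-level `settings` block and maps the path-style key onto the `merge_tree` section of the server configuration; neither is confirmed in this thread, and the replica and template names are hypothetical:

```yaml
# Sketch only: replica-level `settings` and the key path are assumptions
# to verify against the operator version in use.
spec:
  configuration:
    clusters:
    - name: main
      layout:
        shardsCount: 1
        replicas:
        - name: write                 # hypothetical replica name
          templates:
            podTemplate: clickhouse-write
        - name: read                  # hypothetical replica name
          templates:
            podTemplate: clickhouse-read
          settings:
            # MergeTree setting named in the blog post: fetch merged
            # parts from other replicas instead of merging locally.
            merge_tree/always_fetch_merged_part: 1
```

If replica-level settings turn out not to be supported, the same key could be supplied through a per-replica configuration file instead.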

gregakinman avatar Mar 26 '25 16:03 gregakinman

@gregakinman, @mbrancato, it is of course possible to define two separate pod templates for replicas and corresponding separate service templates, like this:

spec:
  configuration:
    clusters:
    - layout:
        replicas:
        - templates:
            podTemplate: clickhouse-replica-1
            replicaServiceTemplate: replica-service-template-1
        - templates:
            podTemplate: clickhouse-replica-2
            replicaServiceTemplate: replica-service-template-2
        shardsCount: 1

In fact, replica services are created automatically by default (you do not need separate templates), but you may override them. The service templates will look like this:

    serviceTemplates:
    - generateName: clickhouse-{chi}
      name: default-service-template
      spec:
        ports:
        - name: http
          port: 8123
        - name: client
          port: 9000
        type: ClusterIP
    - generateName: chi-{chi}-{cluster}-{shard}-{replica}
      name: replica-service-template-1
      spec:
        ports:
        - name: http
          port: 8123
        - name: client
          port: 9000
        - name: replica
          port: 9009
        type: ClusterIP
    - generateName: chi-{chi}-{cluster}-{shard}-{replica}
      name: replica-service-template-2
      spec:
        ports:
        - name: http
          port: 8123
        - name: client
          port: 9000
        - name: replica
          port: 9009
        type: ClusterIP

alex-zaitsev avatar Apr 04 '25 08:04 alex-zaitsev

Hi @mbrancato, would you mind sharing how you set always_fetch_merged_part for those read-only replicas?

boqu avatar Apr 21 '25 07:04 boqu

Is there a way to set up a service that uses the read nodes but falls back to the write nodes if the read nodes are unavailable, so that the CH service stays up in all cases?

himadrisingh avatar Jul 21 '25 06:07 himadrisingh

@himadrisingh, this doesn't make sense. ClickHouse has an active-active, multi-master topology, which means all data you write to one replica is also written to the other replicas in the shard. Instead of splitting the read/write workload, it is better to use load balancing for better utilization.

Slach avatar Jul 21 '25 11:07 Slach