`validatePDSpec` does not validate `Spec.PD.Service`
Bug Report
What version of Kubernetes are you using?
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-03T13:46:05Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: [version.Info]([[http://version.info/){Major](http://version.info/)%7BMajor)](http://version.info/)%7BMajor](http://version.info/)%7BMajor)):"1", Minor:"22", GitVersion:"v1.22.9", GitCommit:"6df4433e288edc9c40c2e344eb336f63fad45cd2", GitTreeState:"clean", BuildDate:"2022-05-19T19:53:08Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
What version of TiDB Operator are you using?
TiDB Operator Version: [version.Info]([[http://version.info/){GitVersion](http://version.info/)%7BGitVersion)](http://version.info/)%7BGitVersion](http://version.info/)%7BGitVersion)):"v1.3.0-45+1470cfb46e1ffb-dirty", GitCommit:"1470cfb46e1ffb8bb86f74ba455865a95b825413", GitTreeState:"dirty", BuildDate:"2022-07-17T16:54:00Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
standard (default) [rancher.io/local-path](http://rancher.io/local-path) Delete WaitForFirstConsumer false 40m
$ kubectl get pvc -n {tidb-cluster-namespace}
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pd-advanced-tidb-pd-0 Bound pvc-b566858b-bf4e-4e33-b31e-5d7feb7397b1 10Gi RWO standard 10m
pd-advanced-tidb-pd-1 Bound pvc-df70980f-12cf-499f-8ad7-e41cac98c5d0 10Gi RWO standard 10m
pd-advanced-tidb-pd-2 Bound pvc-d41691d8-feb5-4e21-b282-1ece1851cffa 10Gi RWO standard 10m
tikv-advanced-tidb-tikv-0 Bound pvc-42e652d6-2400-4ae8-b790-cef8466e4566 100Gi RWO standard 10m
tikv-advanced-tidb-tikv-1 Bound pvc-5af08c43-e02d-433c-896a-b85ad568d1ca 100Gi RWO standard 10m
tikv-advanced-tidb-tikv-2 Bound pvc-652761b6-9fff-4080-b13d-9e364062cddc 100Gi RWO standard 10m
What's the status of the TiDB cluster pods?
$ kubectl get po -n {tidb-cluster-namespace} -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
advanced-tidb-discovery-6998694d4c-fmpjl 1/1 Running 0 41m 10.244.2.2 test-worker3 <none> <none>
advanced-tidb-pd-0 1/1 Running 0 11m 10.244.3.4 test-worker <none> <none>
advanced-tidb-pd-1 1/1 Running 0 11m 10.244.2.4 test-worker3 <none> <none>
advanced-tidb-pd-2 1/1 Running 0 11m 10.244.1.4 test-worker2 <none> <none>
advanced-tidb-tidb-0 2/2 Running 0 9m 10.244.3.7 test-worker <none> <none>
advanced-tidb-tidb-1 2/2 Running 0 9m 10.244.2.7 test-worker3 <none> <none>
advanced-tidb-tidb-2 2/2 Running 0 9m 10.244.1.7 test-worker2 <none> <none>
advanced-tidb-tikv-0 1/1 Running 0 10m 10.244.2.6 test-worker3 <none> <none>
advanced-tidb-tikv-1 1/1 Running 0 10m 10.244.3.6 test-worker <none> <none>
advanced-tidb-tikv-2 1/1 Running 0 10m 10.244.1.6 test-worker2 <none> <none>
tidb-controller-manager-6cc68f8949-52vwt 1/1 Running 0 11m 10.244.3.2 test-worker <none> <none>
tidb-scheduler-dd569b6b4-hj294 2/2 Running 0 11m 10.244.1.2 test-worker2 <none> <none>
What did you do?
We deployed a CR file as below with spec.pd.service.loadBalancerSourceRanges equals to an "invalidIP" (basically a random string which does not comply to correct IP format).
The CR file
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
name: test-cluster
spec:
configUpdateStrategy: RollingUpdate
enableDynamicConfiguration: true
helper:
image: busybox:1.34.1
pd:
baseImage: pingcap/pd
config: "[dashboard]\n internal-proxy = true\n"
maxFailoverCount: 0
mountClusterClientSecret: true
replicas: 3
requests:
storage: 10Gi
service:
loadBalancerSourceRanges:
- invalidIP
pvReclaimPolicy: Retain
tidb:
baseImage: pingcap/tidb
config: "[performance]\n tcp-keep-alive = true\n"
maxFailoverCount: 0
replicas: 3
service:
externalTrafficPolicy: Local
type: NodePort
tikv:
baseImage: pingcap/tikv
config: 'log-level = "info"
'
maxFailoverCount: 0
mountClusterClientSecret: true
replicas: 3
requests:
storage: 100Gi
timezone: UTC
version: v5.4.0
What did you expect to see?
We expected to see an error message indicating that the IP we set was invalid, as we found in the source code that validateService is used to "validate LoadBalancerSourceRanges field from service", and an error message should be thrown if LoadBalancerSourceRanges contains invalid IP.
What did you see instead? This invalid IP was silently rejected and we did not see any error messages indicating why this IP was rejected.
Additional comment
We found that Service under PD is not validated when validating PD spec. And we consider it necessary to call validateService in validatePDSpec, which can validate Service.LoadBalancerSourceRanges under PD.