Support follower read
Problem
Currently, TiSpark only supports reading data from the TiKV leader, which may affect OLTP workloads.
To isolate OLAP traffic from OLTP traffic, we need to read from TiKV followers when we run OLAP workloads with TiSpark.
Goals
Support follower read
Solutions
We will add the following configs:
- `spark.tispark.replica_read`: Read data from the specified role. The available roles are leader, follower and learner. You can also specify multiple roles, and they will be tried in the order you specify.
- `spark.tispark.replica_read.label`: Only select TiKV stores that match the specified labels. Format: label_x=value_x,label_y=value_y
- `spark.tispark.replica_read.address_whitelist`: Only select TiKV stores with the given IP addresses.
- `spark.tispark.replica_read.address_blacklist`: Do not select TiKV stores with the given IP addresses.
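As a rough illustration of how these settings might be read, here is a minimal sketch in plain Java. The class `ReplicaReadConf` is hypothetical and not part of TiSpark. It also assumes that multiple roles and multiple addresses are comma-separated and that the role defaults to leader when unset; the proposal does not state either.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not TiSpark code): parses the four proposed settings
// from a plain key/value map such as the one Spark exposes for its configs.
final class ReplicaReadConf {
  static final String ROLE_KEY = "spark.tispark.replica_read";
  static final String LABEL_KEY = "spark.tispark.replica_read.label";
  static final String WHITELIST_KEY = "spark.tispark.replica_read.address_whitelist";
  static final String BLACKLIST_KEY = "spark.tispark.replica_read.address_blacklist";
  static final List<String> KNOWN_ROLES = Arrays.asList("leader", "follower", "learner");

  final List<String> roles;          // in the order the user wrote them
  final Map<String, String> labels;  // label name -> required value
  final Set<String> whitelist;       // empty means "no restriction" (assumption)
  final Set<String> blacklist;

  ReplicaReadConf(Map<String, String> conf) {
    // Assumption: default to the leader when no role is configured.
    roles = splitCsv(conf.getOrDefault(ROLE_KEY, "leader"));
    for (String role : roles) {
      if (!KNOWN_ROLES.contains(role)) {
        throw new IllegalArgumentException("Unknown role: " + role);
      }
    }
    labels = new LinkedHashMap<>();
    for (String pair : splitCsv(conf.getOrDefault(LABEL_KEY, ""))) {
      String[] kv = pair.split("=", 2);  // format: label_x=value_x
      if (kv.length != 2 || kv[0].trim().isEmpty()) {
        throw new IllegalArgumentException("Invalid label: " + pair);
      }
      labels.put(kv[0].trim(), kv[1].trim());
    }
    whitelist = new LinkedHashSet<>(splitCsv(conf.getOrDefault(WHITELIST_KEY, "")));
    blacklist = new LinkedHashSet<>(splitCsv(conf.getOrDefault(BLACKLIST_KEY, "")));
  }

  // Splits a comma-separated value, trimming whitespace and dropping empty items.
  static List<String> splitCsv(String value) {
    List<String> items = new ArrayList<>();
    for (String item : value.split(",")) {
      String trimmed = item.trim();
      if (!trimmed.isEmpty()) {
        items.add(trimmed);
      }
    }
    return items;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put(ROLE_KEY, "follower,leader");
    conf.put(LABEL_KEY, "zone=east");
    ReplicaReadConf parsed = new ReplicaReadConf(conf);
    System.out.println(parsed.roles + " " + parsed.labels);
  }
}
```

Keeping the roles in a list rather than a set preserves the order in which they were written, which the first setting relies on.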
Here are some references from my investigation:
- TiDB already supports follower read with strongly consistent reads, so it should be possible to support it in TiSpark too.
- client-java added the replica selector in https://github.com/tikv/client-java/pull/151 and https://github.com/tikv/client-java/pull/171, so it can select TiKV stores by role.
And here are the main steps of the implementation:
- Pass and get the above configs
- Add a new class `ReplicaReadPolicy` which implements the `ReplicaSelector` interface in client-java
- Override the `select` method and apply the label, whitelist and blacklist configs
- Pass the `ReplicaReadPolicy` to `TiConfiguration`
- Create `TiSession` with the `TiConfiguration`
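The selection logic in the steps above can be sketched as follows. This is a self-contained illustration, not the code from the pull request: `StoreInfo` and `StoreSelector` are hypothetical stand-ins for client-java's store type and its `ReplicaSelector` interface, whose real signatures may differ. How the three filters combine is also an assumption: here a store must match every label, must not be blacklisted, and must be in the whitelist when the whitelist is non-empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a TiKV store as seen by the selector.
final class StoreInfo {
  final String role;                 // "leader", "follower" or "learner"
  final String address;              // IP address of the store
  final Map<String, String> labels;  // store labels, e.g. zone -> east

  StoreInfo(String role, String address, Map<String, String> labels) {
    this.role = role;
    this.address = address;
    this.labels = labels;
  }
}

// Hypothetical stand-in for client-java's ReplicaSelector interface.
interface StoreSelector {
  List<StoreInfo> select(List<StoreInfo> replicas);
}

// Sketch of the proposed policy: walk the roles in the configured order
// and keep only the stores that pass the label and address filters.
final class ReplicaReadPolicySketch implements StoreSelector {
  private final List<String> roles;
  private final Map<String, String> labels;
  private final Set<String> whitelist;
  private final Set<String> blacklist;

  ReplicaReadPolicySketch(List<String> roles, Map<String, String> labels,
                          Set<String> whitelist, Set<String> blacklist) {
    this.roles = roles;
    this.labels = labels;
    this.whitelist = whitelist;
    this.blacklist = blacklist;
  }

  @Override
  public List<StoreInfo> select(List<StoreInfo> replicas) {
    List<StoreInfo> selected = new ArrayList<>();
    for (String role : roles) {  // earlier roles come first in the result
      for (StoreInfo store : replicas) {
        if (store.role.equals(role) && accept(store)) {
          selected.add(store);
        }
      }
    }
    return selected;
  }

  // Assumption: all filters must pass; an empty whitelist allows every address.
  private boolean accept(StoreInfo store) {
    if (blacklist.contains(store.address)) {
      return false;
    }
    if (!whitelist.isEmpty() && !whitelist.contains(store.address)) {
      return false;
    }
    for (Map.Entry<String, String> required : labels.entrySet()) {
      if (!required.getValue().equals(store.labels.get(required.getKey()))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<StoreInfo> replicas = Arrays.asList(
        new StoreInfo("leader", "10.0.0.1", Collections.singletonMap("zone", "east")),
        new StoreInfo("follower", "10.0.0.2", Collections.singletonMap("zone", "east")));
    StoreSelector policy = new ReplicaReadPolicySketch(
        Arrays.asList("follower", "leader"),
        Collections.<String, String>emptyMap(),
        Collections.<String>emptySet(),
        Collections.<String>emptySet());
    for (StoreInfo store : policy.select(replicas)) {
      System.out.println(store.role + " " + store.address);
    }
  }
}
```

Returning an ordered list instead of a single store leaves room for the caller to fall back to the next candidate; whether the real implementation does so is not stated in the proposal.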
Design
https://github.com/pingcap/tispark/pull/2546
Implementation
https://github.com/pingcap/tispark/pull/2546
Doc
https://github.com/pingcap/tispark/pull/2546
Cool.