
add high scalability using learners and promotion

Open ChrisRx opened this issue 6 years ago • 0 comments

The latency for proposals to be accepted in the raft log increases as more members participate in the cluster. Currently, e2d allows for cluster sizes of 1, 3, or 5 only for several key reasons:

  1. Odd numbers are needed to prevent election results from being split down the middle when leader election occurs.
  2. A 3-member cluster tolerates 1 member failure and a 5-member cluster tolerates 2, while latency stays low because there are not too many voting members.
  3. The targeted cluster size is set statically so that any individual member can reason about quorum without having to communicate with other members.
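
The quorum arithmetic behind these three points can be sketched as follows. This is only an illustration of majority quorum, not code from e2d; the function names `quorum` and `faultTolerance` are made up for this example.

```go
package main

import "fmt"

// quorum returns the number of voting members that must agree before a
// proposal is committed or a leader is elected: a strict majority.
// (Hypothetical helper, not part of e2d.)
func quorum(voters int) int {
	return voters/2 + 1
}

// faultTolerance returns how many voting members can fail while the
// remaining members can still form a quorum.
// (Hypothetical helper, not part of e2d.)
func faultTolerance(voters int) int {
	if voters < 1 {
		return 0
	}
	return voters - quorum(voters)
}

func main() {
	// Print quorum and tolerated failures for each cluster size up to 5.
	for n := 1; n <= 5; n++ {
		fmt.Printf("voters=%d quorum=%d tolerates=%d\n", n, quorum(n), faultTolerance(n))
	}
}
```

Because the targeted size is fixed, a member can compute `quorum` locally. The same arithmetic shows that an even size buys nothing: 4 voters tolerate the same single failure as 3, while requiring a larger quorum.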

The maximum of 5 was chosen based on the recommendation in etcd's FAQ.

etcd 3.4 added a new feature: learners. These are non-voting members that receive raft updates. Adding joining members beyond the maximum of 5 voting members as learners would allow clusters of any size without negatively affecting quorum requirements (and therefore latency).

By default, a joining node should be added as a learner when the cluster it is joining already has the maximum of 5 voting members. Learners should be promoted when a voting member fails, and only by a healthy leader. Here, a healthy leader is one that is both the current cluster leader and able to conclude, from the membership and the targeted cluster size, that it is a valid cluster leader.
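
A rough sketch of the join and promotion rules described above, in plain Go with no etcd dependency. All types and function names here (`Member`, `joinRole`, `healthyLeader`, `pickPromotion`) are hypothetical, and "valid cluster leader" is interpreted, as an assumption, to mean that the leader can see a quorum of healthy voters for the targeted cluster size. In a real implementation the resulting decisions would presumably map onto the etcd v3 client's `MemberAddAsLearner` and `MemberPromote` calls.

```go
package main

import "fmt"

// maxVoters is the maximum number of voting members, per the issue text.
const maxVoters = 5

// Member is a hypothetical, simplified view of a cluster member.
type Member struct {
	ID      uint64
	Learner bool // true for non-voting members
	Healthy bool
}

// countVoters returns the total and healthy voting-member counts.
func countVoters(members []Member) (total, healthy int) {
	for _, m := range members {
		if m.Learner {
			continue
		}
		total++
		if m.Healthy {
			healthy++
		}
	}
	return total, healthy
}

// joinRole reports whether a joining node should be added as a learner:
// it should once the cluster already has the maximum number of voters.
func joinRole(members []Member) (asLearner bool) {
	total, _ := countVoters(members)
	return total >= maxVoters
}

// healthyLeader is an assumed reading of the issue: the node is the
// leader and sees a majority of the targeted cluster size as healthy.
func healthyLeader(isLeader bool, members []Member, targetSize int) bool {
	_, healthy := countVoters(members)
	return isLeader && healthy >= targetSize/2+1
}

// pickPromotion returns the ID of a healthy learner to promote, but only
// when called on a healthy leader and at least one voter has failed.
func pickPromotion(isLeader bool, members []Member, targetSize int) (uint64, bool) {
	if !healthyLeader(isLeader, members, targetSize) {
		return 0, false
	}
	total, healthy := countVoters(members)
	if healthy == total {
		return 0, false // no failed voter, nothing to replace
	}
	for _, m := range members {
		if m.Learner && m.Healthy {
			return m.ID, true
		}
	}
	return 0, false
}

func main() {
	members := []Member{
		{ID: 1, Healthy: true}, {ID: 2, Healthy: false}, {ID: 3, Healthy: true},
		{ID: 4, Healthy: true}, {ID: 5, Healthy: true},
		{ID: 6, Learner: true, Healthy: true},
	}
	fmt.Println("join as learner:", joinRole(members))
	id, ok := pickPromotion(true, members, maxVoters)
	fmt.Println("promote:", id, ok)
}
```

The sketch leaves out removing the failed voter before promotion, which would be needed to keep the voter count at the targeted size.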

ChrisRx · Oct 04 '19 17:10