
graceful reconfiguration v1 design doc

Open jubrad opened this issue 1 year ago • 2 comments

Motivation

Design doc for graceful reconfiguration of managed clusters. Currently limited to a timeout-based reconfiguration delay (v1).

https://github.com/MaterializeInc/materialize/issues/20010

Tips for reviewer

Checklist

  • [ ] This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • [x] This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • [ ] If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • [ ] This PR includes the following user-facing behavior changes:

jubrad avatar Apr 23 '24 20:04 jubrad

Thanks for writing this design doc! I want to capture one thought I had, but don't feel obliged in any way to incorporate it in your design! In the past we had the idea of replica sets, which group replicas in a cluster based on properties and allow reasoning about each group independently of the cluster.

In the case of graceful reconfiguration, the design could reason about replica sets instead of specific replicas, which might make the reasoning simpler. For example, it avoids having to think about how to name pending replicas, because each replica set provides a namespace.

For context, we didn't implement replica sets last year because it was a large change that complicated a user-facing feature even further, but maybe it's worth reconsidering at some point? (https://github.com/MaterializeInc/materialize/pull/21351)

antiguru avatar Jun 06 '24 18:06 antiguru

@antiguru very cool! Conceptually I like the idea of using replica sets to manage groups of replicas for graceful reconfiguration. I suspect we'd have to do something like swap out the default/implicit replica set for the reconfiguration replica set once we've met our check/finalization conditions, then tear down the pre-alter replica set.
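A rough sketch of that swap, purely to illustrate the idea. Every name here (`ReplicaSet`, `Cluster`, `begin_reconfiguration`, `finalize_reconfiguration`, the `ready` flag) is hypothetical and not part of Materialize; the readiness check is left abstract as a boolean.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReplicaSet:
    # Hypothetical model: a named group of replicas inside one cluster.
    name: str
    size: str
    replicas: List[str] = field(default_factory=list)


@dataclass
class Cluster:
    # Hypothetical model: the implicit set is live; the pending set is staged.
    implicit: ReplicaSet
    pending: Optional[ReplicaSet] = None


def begin_reconfiguration(cluster: Cluster, new_size: str) -> None:
    """Stage a replica set with the new size alongside the implicit one."""
    count = len(cluster.implicit.replicas)
    # Replica names may repeat across sets because each set is its own namespace.
    cluster.pending = ReplicaSet(
        name="reconfiguration",
        size=new_size,
        replicas=[f"r{i + 1}" for i in range(count)],
    )


def finalize_reconfiguration(cluster: Cluster, ready: bool) -> Optional[ReplicaSet]:
    """Swap in the pending set once the check passes; return the set to tear down."""
    if cluster.pending is None or not ready:
        return None
    old = cluster.implicit
    cluster.implicit = cluster.pending
    cluster.pending = None
    return old
```

The caller would tear down whatever `finalize_reconfiguration` returns; until the check passes, the pre-alter set stays in place untouched.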

It seems like, for graceful reconfiguration, we would only want to act on the implicit replica set, and we would not support any graceful reconfiguration feature for replica sets themselves.

For the sake of this design doc and the feature's roadmap, I think it's a good idea to hold off on incorporating this, at least for v1. I also suspect incorporating replica sets wouldn't broadly change the design, but it would clean up some of the individual replica management.

jubrad avatar Jun 08 '24 03:06 jubrad