Introducing break the glass as a principle

Open grmhay opened this issue 4 years ago • 7 comments

We (representing Morgan Stanley) believe the community is leaving one situation for each implementer to overcome: the source of truth for desired state (e.g. github.com or a git equivalent that an enterprise may run, recognizing that there are both centralized and decentralized approaches to storing desired state) being less available than your users' expected SLA for making configuration changes. Put succinctly, if (in our example) GitHub is unavailable and you want to make changes to your System State, there should be one approach and a set of tooling to allow reconciliation after the fact. A further example exists in disconnected systems (e.g. Kubernetes on a ship), where the system may be disconnected from the store where the desired state resides. How then would the system state be updated in an emergency and then reconciled with the desired state?

Leaving this unsolved will harm adoption of GitOps, and it is inefficient, as I believe we share a common challenge that we can solve once within the project. The first step, as this project has so well established, is a glossary of terms to allow us to describe the problem and a draft principle to add. I have included these in this PR.
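To make "reconciliation after the fact" concrete, here is a minimal sketch in Python. It is not part of the PR and not a tool the project provides; it assumes desired and live state can each be flattened into a simple key/value mapping, and the `drift` helper and the sample values are hypothetical.

```python
from typing import Any, Dict, Tuple


def drift(desired: Dict[str, Any], live: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return every key whose value differs, as {key: (desired_value, live_value)}.

    A key missing on one side is reported with None for that side.
    """
    keys = desired.keys() | live.keys()
    return {
        key: (desired.get(key), live.get(key))
        for key in sorted(keys)
        if desired.get(key) != live.get(key)
    }


# Hypothetical example: the state store was unreachable, so an operator
# scaled the workload by hand ("broke the glass").
desired = {"replicas": 3, "image": "app:1.4"}
live = {"replicas": 5, "image": "app:1.4"}

# Once the store is reachable again, these are the changes that must be either
# committed to the desired state or reverted on the running system.
changes = drift(desired, live)
```

The sketch only detects divergence; deciding whether the emergency change wins or the stored desired state wins is the policy question this PR raises.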

grmhay avatar Oct 20 '21 02:10 grmhay

@grmhay thanks for this PR! To get more discussion on this, you may want to:

  1. add this to the next WG meeting agenda?
  2. threaded discussions may help. The place to start would be here: https://github.com/open-gitops/project/discussions
  3. to help put this into context with previous/existing discussion, add a list of links to previous PRs and comments as needed. I could help with that if it's useful
    • Also @lloydchang did a lot of this linking already in his comments on this closed PR https://github.com/open-gitops/documents/pull/38

scottrigby avatar Oct 21 '21 18:10 scottrigby

My gut response is perhaps principle 3 could somehow address that agents should be able to pull the manifests from the source WHENEVER NEEDED (not just when a CI job runs, or as you said limited to uptime of your source of truth).

We removed the "break glass" glossary item temporarily, because:

  • The principles or other glossary items were revised and no longer mention this, so it was an orphaned glossary item
  • Less importantly (but still a factor), because "break glass" starts with "B" it was the first thing people read about when opening the glossary – which tells them when NOT to do GitOps (or when to pause, etc). Even though it should be mentioned somewhere (perhaps best practices?) leading with this seemed like not putting our best foot forward

What about adding something like "whenever needed" to principle 3?

3. **Pulled Automatically**
-    Software agents automatically pull the desired state declarations from the source.
+    Software agents automatically pull the desired state declarations from the source <whenever needed>.

Then link "whenever needed" to a glossary item about source uptime, which could then link to your "Intermediate State Store" item and perhaps some version of the former "break glass" glossary item?
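One way to read "whenever needed" is an agent that falls back to an intermediate store when the primary source is down. The Python sketch below is only an illustration under that assumption: `SourceUnavailable`, `pull_desired_state`, and the two fetch callables are hypothetical names, not part of any GitOps tool or of the glossary.

```python
from typing import Callable, Dict, Tuple


class SourceUnavailable(Exception):
    """Raised by a fetch function when its store cannot be reached (hypothetical)."""


def pull_desired_state(
    fetch_primary: Callable[[], Dict[str, str]],
    fetch_intermediate: Callable[[], Dict[str, str]],
) -> Tuple[Dict[str, str], str]:
    """Try the primary source first, then the intermediate state store.

    Returns the desired state plus the name of the store that served it, so the
    caller can record that a fallback happened and reconcile later.
    """
    try:
        return fetch_primary(), "primary"
    except SourceUnavailable:
        return fetch_intermediate(), "intermediate"


# Hypothetical usage: the primary source (e.g. a hosted git service) is down.
def primary_down() -> Dict[str, str]:
    raise SourceUnavailable("primary source unreachable")


def cached_copy() -> Dict[str, str]:
    return {"image": "app:1.4"}


state, served_by = pull_desired_state(primary_down, cached_copy)
```

Returning the serving store alongside the state is a deliberate choice in this sketch: it leaves a record that the agent ran from a fallback, which is the information a later reconciliation step would need.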

scottrigby avatar Oct 21 '21 18:10 scottrigby

My two cents: I don't think "break glass" is something that should be a principle.

This is something that can be a "best practice" or "operating model" or a "white paper".

Break glass is too specific for these principles, which are meant to be open ended.

christianh814 avatar Oct 27 '21 18:10 christianh814

Ah really interesting idea @grmhay! You make some really good points.

Couple things

  1. The format of the principles is "The desired state of a GitOps managed system must be:" so the formatting would need to fit within that framework. Something like "Always appliable" or something that would match.
  2. The basic concern is "what if I need to deploy something and GitOps isn't working" in which case I need to be able to do manual processes, potentially outside of git to make things work again. I think we can all understand that this might happen but is it something that should be covered by GitOps itself? If you're having to break glass every week then we may not really have achieved GitOps right? I could
  3. I think we should develop the idea of break glass policy as a whitepaper within the documents repo in the meantime.

todaywasawesome avatar Oct 27 '21 18:10 todaywasawesome

Also cross-linking older discussion https://github.com/open-gitops/project/discussions/86

scottrigby avatar Jan 17 '22 23:01 scottrigby

Just revisiting this. @grmhay Would you want to close this and open a "best practice" or "white paper" PR?

christianh814 avatar Feb 19 '22 23:02 christianh814

Hi

White paper is probably the best label as I think best practice might be a bit presumptuous of the reader’s situation.

Graeme

grmhay avatar Feb 23 '22 22:02 grmhay

Any updates or progress with this?

williamcaban avatar Mar 13 '23 23:03 williamcaban

Asked again in this Slack thread.

scottrigby avatar Mar 17 '23 05:03 scottrigby

There has been recent (and excellent!) discussion in the reviews and comments on this PR. We have a discussion item for this here: https://github.com/open-gitops/project/discussions/86. Could one of you please summarize the above conversation and move it into the discussion linked here? That way we can keep this conversation alive even though I'm closing this PR now.

BTW, we'll be using those discussion topics as the basis for this KubeCon EU OpenGitOps project meeting in Amsterdam.

scottrigby avatar Apr 07 '23 16:04 scottrigby