Installing the ARC on multiple orgs using GH App authentication

Open casanova-21 opened this issue 4 years ago • 12 comments

Hi, team. 👋 I'm working on setting the ARC up for a high profile customer, and we're running into an issue that I'm not sure how to get past.

We have successfully configured the ARC for a single org using the GH App authentication method. We're using the webhook on the GH App to trigger the autoscaling of the runner pods. We're now trying to enable this for their other orgs, so we made the app public and then installed it on a second org. We then deployed a second runner-deployment.yaml that contains the second org, and I can see in the "describe runner" output that it is unable to update the registration token for this other org. I can also see the "403 Resource not accessible by integration" error in the controller pod logs. I believe this is because the GH App installation ID for the second org is unique and doesn't match the installation ID that is stored in the controller-manager secret in the actions-runner-system namespace.

What is the proper method of getting this installed across multiple orgs? My customer does not want to use PAT authentication, because the PAT would be tied to a service account that has to be rotated every six months. And as far as I'm aware, you can't install the GH App at the enterprise level.

casanova-21 avatar Jan 21 '22 19:01 casanova-21

Deploy a controller per organisation (https://github.com/actions-runner-controller/actions-runner-controller#deploying-multiple-controllers). ARC only takes a single GitHub App configuration and, as you've seen, the installation ID is unique per install.

toast-gear avatar Jan 21 '22 21:01 toast-gear

@toast-gear Thanks for the response! I did read through that section on deploying multiple controllers. Our customers that want to adopt the ARC at scale are likely going to be unwilling to run a separate controller for each of their orgs. We've got customers in some cases with hundreds of orgs, and a controller per org will be too much overhead for them. Is the PAT authentication going to be their only option then? If that is the case, we'll need some kind of feature that allows the GH App method to scale better. My customer doesn't want to use PAT authentication as the PAT will be tied to a service account which will need to be rotated every six months. They're concerned about incurring downtime when that rotation occurs.

I looked at the RunnerDeployment file and it appears that "organization" is a string as opposed to an array, which implies to me that you can only ever specify a single org per runner deployment. Is the best option here going to be a Runner Group?

casanova-21 avatar Jan 24 '22 15:01 casanova-21

Is the best option here going to be a Runner Group?

Sorry, could you clarify your question regarding Runner Groups? Are you asking if you could use Runner Groups to share runners across orgs?

cbui avatar Feb 03 '22 01:02 cbui

@cbui Yes, sir. You can create Runner Groups at the enterprise level. But I think ultimately, even though you may share that runner group with multiple orgs, the controller is still going to need to be able to authenticate with each org, which is going to require the secret that contains the application installation ID. I believe the application installation ID that's part of the secret is required for the authentication, so it has to be used.

casanova-21 avatar Feb 03 '22 15:02 casanova-21

@toast-gear Regarding the installation ID requirement when installing the controller per org, would it be possible to use the "List installations for App" API to get the installations and then get the installation IDs for the corresponding orgs? It seems that API can be accessed by authenticating directly as the app instead of as an installation, so using a private key attached to the GH App would work (and since the key is common to all the orgs, it would only need to be specified once).

I'm not sure but I think the philips-labs aws runner might be using this approach since they rely on a private key when using the App as well
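A rough sketch of that lookup, in Python, assuming the response of `GET /app/installations` has already been fetched and parsed. The helper names and the sample payload are hypothetical; only the endpoint path and the `id`, `target_type`, and `account.login` fields come from the GitHub REST API. Signing the app JWT with the private key is left out because it needs a crypto library.

```python
import urllib.request


def build_list_installations_request(app_jwt, api_base="https://api.github.com"):
    """Build (but do not send) the request for GET /app/installations.

    app_jwt is a JWT signed with the GitHub App's private key; creating it
    is out of scope for this sketch. On GHES the api_base would differ.
    """
    return urllib.request.Request(
        f"{api_base}/app/installations",
        headers={
            "Authorization": f"Bearer {app_jwt}",
            "Accept": "application/vnd.github+json",
        },
    )


def installation_ids_by_org(installations):
    """Map organization login -> installation ID, skipping user-account installs."""
    return {
        item["account"]["login"]: item["id"]
        for item in installations
        if item.get("target_type") == "Organization"
    }


# Illustrative payload trimmed to the fields used here; all values are made up.
sample = [
    {"id": 101, "target_type": "Organization", "account": {"login": "myorg2"}},
    {"id": 102, "target_type": "Organization", "account": {"login": "myorg3"}},
    {"id": 103, "target_type": "User", "account": {"login": "someuser"}},
]
org_ids = installation_ids_by_org(sample)
print(org_ids)
```

The endpoint is paginated, so with hundreds of orgs a real client would have to follow the pages before building the map.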

FearlessHyena avatar Feb 07 '22 08:02 FearlessHyena

We've got customers in some cases with hundreds of orgs, and a controller per org will be too much overhead for them

Implementation-wise, one would need to enhance ARC to internally have multiple GitHub API clients, perhaps one per organization or repository to support this use-case. ARC's current design assumes one can call any GitHub APIs required to manage runners with a single GitHub client that shares the same PAT or App creds you've passed to ARC.

Maybe we'd better enhance the RunnerDeployment and RunnerSet specs so they can reference the K8s secret from which ARC should load GitHub API credentials (either a PAT or the App's)?

Examples:

kind: RunnerDeployment
spec:
  repository: myorg1/myrepo2
  githubAPICredentialsFrom:
    secretName: pat-has-access-to-myrepo2

---
kind: RunnerDeployment
spec:
  organization: myorg2
  githubAPICredentialsFrom:
    secretName: app-creds-and-installation-id-for-myorg2

---
kind: RunnerDeployment
spec:
  organization: myorg3
  githubAPICredentialsFrom:
    secretName: app-creds-and-installation-id-for-myorg3

---
kind: RunnerDeployment
spec:
  enterprise: myenterprise1
  githubAPICredentialsFrom:
    secretName: pat-of-enterprise-admin-with-admin-priv
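To make the "multiple GitHub API clients" idea concrete, here is a minimal Python sketch of how a controller could resolve and cache one client per referenced secret. This is not ARC's implementation (ARC is written in Go); `ClientCache`, `credentials_secret_for`, and `fake_client` are hypothetical names, and the fallback to a default secret is an assumption. The `githubAPICredentialsFrom.secretName` field follows the proposal above.

```python
DEFAULT_SECRET = "controller-manager"  # controller-wide secret mentioned in this thread


def credentials_secret_for(spec, default=DEFAULT_SECRET):
    """Return the secret name a spec points at, or the controller-wide default."""
    ref = spec.get("githubAPICredentialsFrom") or {}
    return ref.get("secretName", default)


class ClientCache:
    """Keep one GitHub client per credentials secret instead of one global client."""

    def __init__(self, make_client):
        self._make_client = make_client  # callable: secret name -> client
        self._clients = {}

    def client_for(self, spec):
        name = credentials_secret_for(spec)
        if name not in self._clients:
            # Build the client lazily, the first time a secret is referenced.
            self._clients[name] = self._make_client(name)
        return self._clients[name]


created = []


def fake_client(secret_name):
    """Stand-in for reading the secret and constructing a real API client."""
    created.append(secret_name)
    return {"secret": secret_name}


cache = ClientCache(fake_client)
org2_spec = {
    "organization": "myorg2",
    "githubAPICredentialsFrom": {
        "secretName": "app-creds-and-installation-id-for-myorg2"
    },
}
first = cache.client_for(org2_spec)
second = cache.client_for(org2_spec)  # reuses the cached client
fallback = cache.client_for({"organization": "myorg1"})  # no reference given
```

Keying the cache by secret name rather than by organization lets several runner deployments share one client when they reference the same credentials.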

mumoshu avatar Feb 18 '22 01:02 mumoshu

@mumoshu Thanks for the response! This would definitely be an improvement over the current situation. We are still going to have the challenge on the GitHub side with the installation of the GitHub app on each org being a manual process, as there's not an API for it. And GitHub doesn't have enterprise-level apps, so it has to be done on each org. Those obviously aren't direct ARC problems, but they contribute to the overall challenge of how to get ARC distributed across an enterprise that is using many orgs.

ARC platform admins would also need to build a mechanism to automate the creation of the secrets within the cluster and also get their runnerdeployment.yaml populated/updated with each secret name. Fortunately that's not as challenging to deal with as the required manual process for creating the GH app on each org though.

casanova-21 avatar Feb 22 '22 22:02 casanova-21

build a mechanism to automate the creation of the secrets within the cluster and also get their runnerdeployment.yaml populated/updated with each secret name

@casanova-21 Hey! Just curious, but does the mentioned process need to be automated? My assumption was that you would only ever do this once per new organization.

Assuming a GitHub organization won't come and go in a day or two, I was rather thinking it doesn't need to be automated? 🤔

mumoshu avatar Mar 04 '22 00:03 mumoshu

@mumoshu Hey! 👋 Yeah, so consider the scenario where I'm a large enterprise with hundreds or even thousands of orgs in my GHES platform. We certainly preach "less orgs is better" here, but we definitely have companies with huge numbers of orgs as the developers have been given carte blanche to create them at will. So now we've seen the demo of ARC and decide we want to adopt it to replace our Jenkins platform.

Since GitHub doesn't currently have an API to install apps, that has to be done manually through the UI for each org on which you want to use the ARC. And assuming ARC is somehow enhanced to not require a separate controller for each org, you then need to take that private key, app id, and app installation id and make a k8s secret for each org in the cluster. And you need to maintain a runnerdeployment.yml for each org that references the specific k8s secret to be used. And imagine you've got new orgs being created or deleted throughout the week. There's definitely a need to automate all of that as it would become an administrative nightmare.

It feels like the right answer is for GitHub to support apps at the enterprise level to avoid all of this, and then you just have a single private key, app id, and app installation id for the enterprise. But I don't know that we have this on our roadmap at the moment. So we're stuck with a lot of manual tasks here to support scaling this across many orgs.
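The per-org bookkeeping described above could be scripted along these lines. This is a Python sketch under several assumptions: the secret key names mirror the ones used for ARC's controller-wide GitHub App secret, the `githubAPICredentialsFrom` field and the abbreviated spec shape are taken from the proposal earlier in this thread (a released ARC version may place them differently), and the function names, resource naming scheme, and default namespace are hypothetical.

```python
import json


def secret_name_for(org):
    """Derive a secret name; lowercased because Kubernetes names must be lowercase."""
    return f"app-creds-and-installation-id-for-{org.lower()}"


def manifests_for_org(org, app_id, installation_id, private_key_pem,
                      namespace="actions-runner-system"):
    """Return a Secret and a RunnerDeployment manifest for one organization."""
    secret = {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": secret_name_for(org), "namespace": namespace},
        # stringData takes plain strings, so no manual base64 encoding is needed.
        "stringData": {
            "github_app_id": str(app_id),
            "github_app_installation_id": str(installation_id),
            "github_app_private_key": private_key_pem,
        },
    }
    runner_deployment = {
        "apiVersion": "actions.summerwind.dev/v1alpha1",
        "kind": "RunnerDeployment",
        "metadata": {"name": f"{org.lower()}-runners", "namespace": namespace},
        # Abbreviated spec shape, as in the proposal from this thread.
        "spec": {
            "organization": org,
            "githubAPICredentialsFrom": {"secretName": secret_name_for(org)},
        },
    }
    return [secret, runner_deployment]


# Made-up inputs: one shared app ID and key, one installation ID per org.
installation_ids = {"myorg2": 101, "MyOrg3": 102}
manifests = []
for org, installation_id in sorted(installation_ids.items()):
    manifests.extend(manifests_for_org(org, 12345, installation_id, "<private key PEM>"))
print(json.dumps(manifests, indent=2))
```

Re-running a script like this whenever orgs are added or removed would cover the cluster side; installing the app on each org would still be the manual step described above.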

casanova-21 avatar Mar 16 '22 19:03 casanova-21

@casanova-21 Hey! #1371 is the PR for the new feature to let ARC handle multiple GitHub App installations across organizations, without requiring one-ARC-per-org. It's planned for ARC 0.25.0 so probably we'll be able to merge it within a month.

To fully resolve the original issue, though, we'd still need GitHub to provide either (1) a new API to let ARC install the GitHub App onto an org or (2) a new GitHub feature to allow manually installing a GitHub App enterprise-wide. Do you have any news or info about that?

mumoshu avatar May 25 '22 01:05 mumoshu

@mumoshu Hey 👋 Thank you so much for working towards getting this addressed. I'm starting internal conversations here around this PR. Of the two items you listed, it's far more likely we'll get that API than the enterprise-wide GitHub app capability. Let me talk to the Actions team here to see what they think, and I'll get back to you.

casanova-21 avatar May 27 '22 19:05 casanova-21

We've got customers in some cases with hundreds of orgs, and a controller per org will be too much overhead for them

Implementation-wise, one would need to enhance ARC to internally have multiple GitHub API clients, perhaps one per organization or repository to support this use-case. ARC's current design assumes one can call any GitHub APIs required to manage runners with a single GitHub client that shares the same PAT or App creds you've passed to ARC.

Maybe we'd better enhance the RunnerDeployment and RunnerSet specs so they can reference the K8s secret from which ARC should load GitHub API credentials (either a PAT or the App's)?

Examples:

kind: RunnerDeployment
spec:
  repository: myorg1/myrepo2
  githubAPICredentialsFrom:
    secretName: pat-has-access-to-myrepo2

---
kind: RunnerDeployment
spec:
  organization: myorg2
  githubAPICredentialsFrom:
    secretName: app-creds-and-installation-id-for-myorg2

---
kind: RunnerDeployment
spec:
  organization: myorg3
  githubAPICredentialsFrom:
    secretName: app-creds-and-installation-id-for-myorg3

---
kind: RunnerDeployment
spec:
  enterprise: myenterprise1
  githubAPICredentialsFrom:
    secretName: pat-of-enterprise-admin-with-admin-priv

Is this supported yet? I'm trying something like this:

kind: RunnerDeployment
metadata:
  name: runner1
spec:
  replicas: 1
  template:
    spec:
      organization: myorg
---
kind: RunnerDeployment
metadata:
  name: runner2
spec:
  replicas: 1
  template:
    spec:
      repository: myorg/my-repo2

The RunnerDeployment for the org was created first and works as expected. When trying to deploy the 2nd runner, I get an error: failed to create registration token...403 Resource not accessible by integration

Both runners are using the same GitHub app credentials.

The GitHub app is available only to repo1 and repo2. What I'm trying to accomplish is having a dedicated runner for each repo, but I can't seem to get that to work when it's configured for the repo, only when using the org.

igitcode avatar Dec 19 '23 20:12 igitcode