
Service containers are not working

Open elcidowneador opened this issue 2 years ago • 9 comments

Checks

  • [X] I've already read https://github.com/actions/actions-runner-controller/blob/master/TROUBLESHOOTING.md and I'm sure my issue is not covered in the troubleshooting guide.
  • [X] I'm not using a custom entrypoint in my runner image

Controller Version

0.5.0

Helm Chart Version

0.5.0

CertManager Version

No response

Deployment Method

Helm

cert-manager installation

N/A

Checks

  • [X] This isn't a question or user support case (For Q&A and community support, go to Discussions. It might also be a good idea to contract with any of the contributors and maintainers if your business is so critical that you need priority support)
  • [X] I've read the release notes before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
  • [X] My actions-runner-controller version (v0.x.y) does support the feature
  • [X] I've already upgraded ARC (including the CRDs, see charts/actions-runner-controller/docs/UPGRADING.md for details) to the latest and it didn't fix the issue
  • [X] I've migrated to the workflow job webhook event (if you are using webhook-driven scaling)

Resource Definitions

githubConfigUrl: "https://github.com/xxxxx"
watchSingleNamespace: "gha-runner"
minRunners: 0
maxRunners: 50
githubConfigSecret: gha-app-secret
runnerScaleSetName: "runner-default-x64"
containerMode:
  type: "kubernetes"
  kubernetesModeWorkVolumeClaim:
    accessModes: ["ReadWriteOnce"]
    storageClassName: "gp3"
    resources:
      requests:
        storage: 10Gi
        memory: "2Gi"
controllerServiceAccount:
  namespace: gha-runner-controller
  name: gha-runner-controller
template:
  spec:
    securityContext:
      fsGroup: 123
    containers:
    - name: runner
      image: ghcr.io/actions/actions-runner:latest
      command: ["/home/runner/run.sh"]
      env:
      - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
        value: "false"
      resources:
        requests:
          memory: "4096Mi"
          cpu: "2"
        limits:
          memory: "4096Mi"
      volumeMounts:
        - name: work
          mountPath: /home/runner/_work
    volumes:
      - name: work
        ephemeral:
          volumeClaimTemplate:
            spec:
              accessModes: ["ReadWriteOnce"]
              storageClassName: "gp3"
              resources:
                requests:
                  storage: 10Gi
    tolerations:
      - key: dedicated
        operator: Equal
        value: github-actions-default
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: tier
              operator: In
              values:
              - github-actions-default

To Reproduce

Use something like this in the workflow:

jobs:
  build:
    name: Build
    runs-on: runner-default-x64
    services:
      selenium:
        image: selenium/standalone-chrome

      redis:
        image: redis:4-alpine

Describe the bug

When I try to use service containers, I get an error (see the runner pod logs linked below) and the pipeline stops.

Describe the expected behavior

Start service containers to be used within the build phase.

Whole Controller Logs

N/A

Whole Runner Pod Logs

https://gist.github.com/elcidowneador/8064ed64ad7dfcd86580fc3321c5a41b

Additional Context

No response

Sep 01 '23 22:09 elcidowneador

Hello! Thank you for filing an issue.

The maintainers will triage your issue shortly.

In the meantime, please take a look at the troubleshooting guide for bug reports.

If this is a feature request, please review our contribution guidelines.

Sep 01 '23 22:09 github-actions[bot]

Have you tried adding a job container image? We use service containers in many, if not all, of our CI pipelines, but we always specify the job container image. To my knowledge, ARC runners behave a little differently from GitHub-hosted runners in this regard.

Oct 10 '23 15:10 genisd

@genisd Agreed; the error message states that the job needs an image to run. @elcidowneador, is this still a problem?

jobs:
  build:
    name: Build
    container:
      image: <<<<add-container-image-here>>>>
    runs-on: runner-default-x64
    services:
      selenium:
        image: selenium/standalone-chrome

      redis:
        image: redis:4-alpine

Feb 16 '24 10:02 bastianwegge

Hello! I'm trying to use services in Kubernetes mode and I'm getting this error message:

Error: Error: failed to create job pod: Pod "arc-runner-set-kubernetes-mode-drp9c-runner-bdlz5-workflow" is invalid: spec.containers[4].name: Duplicate value: "ubuntu"

The ubuntu image is used by two different service containers in the same job. Does anyone have an idea what to do? It seems the controller names each container in the workflow pod after the service container's image name rather than the service container's name.
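
For illustration, a workflow of roughly the following shape matches that description: two services in one job, both using the ubuntu image. This is only a sketch. The service names, the job container image, the step, and the `runs-on` label (guessed from the pod name in the error) are all assumptions, not taken from the actual workflow.

```yaml
# Hypothetical reproduction sketch; every name below is illustrative.
jobs:
  build:
    runs-on: arc-runner-set-kubernetes-mode   # assumed scale set name
    container:
      image: node:20                          # assumed job container image
    services:
      worker-a:                               # hypothetical service name
        image: ubuntu
      worker-b:                               # hypothetical service name
        image: ubuntu                         # same image as worker-a
    steps:
      - run: echo "job pod started"
```

If container names in the workflow pod are derived from the image name, as the error suggests, both services would map to a container named "ubuntu", and Kubernetes rejects a pod whose containers do not have unique names.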

Mar 04 '24 08:03 caiocsgomes

@caiocsgomes I don't mean to cross any boundaries, but your problem is probably better off as its own issue rather than in an issue from September last year about service containers not being available at all.

You can create a new issue here.

/edit: And don't forget to post your service definition, or anything else more helpful than the error message alone, because it looks like you might have misconfigured something.

Mar 04 '24 15:03 bastianwegge

Yes, I agree. I wondered if this was his problem too, but that doesn't seem to be the case. We are setting up scale sets now; I'll open a new issue if we can't find a viable solution. Thanks!

Mar 04 '24 15:03 caiocsgomes

Running into this same issue. While specifying a container does "work", it forces you to run the job in a container instead of just using services. And if I do run the job in a container, I have issues with my custom CA and caching.

Mar 16 '24 20:03 Hayden-J-C