Node.js Executable Not Found When Using gcr.io/kaniko-project/executor:debug Base Image
Checks
- [X] I've already read https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/troubleshooting-actions-runner-controller-errors and I'm sure my issue is not covered in the troubleshooting guide.
- [X] I am using charts that are officially provided
Controller Version
0.9.3
Deployment Method
Helm
Checks
- [X] This isn't a question or user support case (For Q&A and community support, go to Discussions).
- [X] I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
To Reproduce
1. Deploy the `gha-runner-scale-set` chart with `containerMode.type: kubernetes` enabled (see the install sketch below).
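A minimal install sketch (release name, namespace, and values file name are placeholders; the values used are shown under Additional Context):

helm install test-runners \
  --namespace arc-runners --create-namespace \
  -f values.yaml \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set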
Describe the bug
I'm currently running GitHub Actions runners with containerMode.type: kubernetes enabled, using two different job container images: ubuntu:latest and gcr.io/kaniko-project/executor:debug.
When I use gcr.io/kaniko-project/executor:debug as the job container image, the Node.js executable is not found at the expected path and the workflow fails immediately.
Upon inspecting the container, the Node.js files are present but the binary cannot be executed, which causes the step to fail with the following error:
sh: /__e/node20/bin/node: not found
When I exec into the pod using the gcr.io/kaniko-project/executor:debug image:
/__e # ls
node16 node20
/__e #
/__e # cd node20/
/__e/node20 # ls
CHANGELOG.md LICENSE README.md bin include lib share
/__e/node20 #
/__e/node20 # cd bin/
/__e/node20/bin # ls
corepack node npm npx
/__e/node20/bin #
/__e/node20/bin # pwd
/__e/node20/bin
/__e/node20/bin # /__e/node20/bin/node
sh: /__e/node20/bin/node: not found
Workflow File Using gcr.io/kaniko-project/executor:debug:
name: Build and Deploy
on:
  workflow_call:
    inputs:
      branch:
        required: true
        type: string
        default: 'main'
      build_registry:
        required: true
        type: string
jobs:
  arm-build:
    runs-on: [test-runners]
    container:
      image: gcr.io/kaniko-project/executor:debug
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
Working Example with ubuntu:latest
When using the ubuntu:latest image, the Node.js executable is present and the workflow runs as expected.
name: Build and Deploy
on:
  workflow_call:
    inputs:
      branch:
        required: true
        type: string
        default: 'main'
      build_registry:
        required: true
        type: string
jobs:
  arm-build:
    runs-on: [test-runners]
    container:
      image: ubuntu:latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
Node.js Executable Found:
root@test-runners-px7pq-runner-2wsv4-workflow:/__e/node20/bin# ls
corepack node npm npx
root@test-runners-px7pq-runner-2wsv4-workflow:/__e/node20/bin#
root@test-runners-px7pq-runner-2wsv4-workflow:/__e/node20/bin#
root@test-runners-px7pq-runner-2wsv4-workflow:/__e/node20/bin# pwd
/__e/node20/bin
root@test-runners-px7pq-runner-2wsv4-workflow:/__e/node20/bin# /__e/node20/bin/node
Welcome to Node.js v20.13.1.
Type ".help" for more information.
>
This issue is a major blocker, as it prevents us from using a customized image that matches our requirements. We need to run job containers from our own predefined images, with custom tools and packages installed.
I need help resolving this so that the Node.js executable can be found and used correctly when using the gcr.io/kaniko-project/executor:debug image. Any insights or solutions would be greatly appreciated.
Describe the expected behavior
The workflow should run successfully and Node.js-based actions should execute correctly, even when using a customized image such as gcr.io/kaniko-project/executor:debug. The Node.js executable should be found and functional, so that all steps in the workflow proceed without errors.
Additional Context
## githubConfigUrl is the GitHub url for where you want to configure runners
## ex: https://github.com/myorg/myrepo or https://github.com/myorg
githubConfigUrl: "https://github.com/"
## githubConfigSecret is the k8s secrets to use when auth with GitHub API.
## You can choose to use GitHub App or a PAT token
githubConfigSecret:
### GitHub Apps Configuration
## NOTE: IDs MUST be strings, use quotes
#github_app_id: ""
#github_app_installation_id: ""
#github_app_private_key: |
### GitHub PAT Configuration
# github_token: ""
## If you have a pre-define Kubernetes secret in the same namespace the gha-runner-scale-set is going to deploy,
## you can also reference it via `githubConfigSecret: pre-defined-secret`.
## You need to make sure your predefined secret has all the required secret data set properly.
## For a pre-defined secret using GitHub PAT, the secret needs to be created like this:
## > kubectl create secret generic pre-defined-secret --namespace=my_namespace --from-literal=github_token='ghp_your_pat'
## For a pre-defined secret using GitHub App, the secret needs to be created like this:
## > kubectl create secret generic pre-defined-secret --namespace=my_namespace --from-literal=github_app_id=123456 --from-literal=github_app_installation_id=654321 --from-literal=github_app_private_key='-----BEGIN CERTIFICATE-----*******'
githubConfigSecret: github-token
## proxy can be used to define proxy settings that will be used by the
## controller, the listener and the runner of this scale set.
#
# proxy:
# http:
# url: http://proxy.com:1234
# credentialSecretRef: proxy-auth # a secret with `username` and `password` keys
# https:
# url: http://proxy.com:1234
# credentialSecretRef: proxy-auth # a secret with `username` and `password` keys
# noProxy:
# - example.com
# - example.org
# maxRunners is the max number of runners the autoscaling runner set will scale up to.
# maxRunners: 5
# minRunners is the min number of idle runners. The target number of runners created will be
# calculated as a sum of minRunners and the number of jobs assigned to the scale set.
minRunners: 2
runnerGroup: "test-runners"
# ## name of the runner scale set to create. Defaults to the helm release name
runnerScaleSetName: "test-runners"
## A self-signed CA certificate for communication with the GitHub server can be
## provided using a config map key selector. If `runnerMountPath` is set, for
## each runner pod ARC will:
## - create a `github-server-tls-cert` volume containing the certificate
## specified in `certificateFrom`
## - mount that volume on path `runnerMountPath`/{certificate name}
## - set NODE_EXTRA_CA_CERTS environment variable to that same path
## - set RUNNER_UPDATE_CA_CERTS environment variable to "1" (as of version
## 2.303.0 this will instruct the runner to reload certificates on the host)
##
## If any of the above had already been set by the user in the runner pod
## template, ARC will observe those and not overwrite them.
## Example configuration:
#
# githubServerTLS:
# certificateFrom:
# configMapKeyRef:
# name: config-map-name
# key: ca.crt
# runnerMountPath: /usr/local/share/ca-certificates/
## Container mode is an object that provides out-of-box configuration
## for dind and kubernetes mode. Template will be modified as documented under the
## template object.
##
## If any customization is required for dind or kubernetes mode, containerMode should remain
## empty, and configuration should be applied to the template.
containerMode:
  type: "kubernetes" ## type can be set to dind or kubernetes
  ## the following is required when containerMode.type=kubernetes
  kubernetesModeWorkVolumeClaim:
    accessModes: ["ReadWriteOnce"]
    # For local testing, use https://github.com/openebs/dynamic-localpv-provisioner/blob/develop/docs/quickstart.md to provide dynamic provision volume with storageClassName: openebs-hostpath
    storageClassName: "gp3"
    resources:
      requests:
        storage: 5Gi
  # kubernetesModeServiceAccount:
  #   annotations:
## listenerTemplate is the PodSpec for each listener Pod
## For reference: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodSpec
listenerTemplate:
  spec:
    nodeSelector:
      purpose: github-actions
    tolerations:
      - key: purpose
        operator: Equal
        value: github-actions
        effect: NoSchedule
    containers:
      # Use this section to append additional configuration to the listener container.
      # If you change the name of the container, the configuration will not be applied to the listener,
      # and it will be treated as a side-car container.
      - name: listener
        resources:
          limits:
            cpu: "500m"
            memory: "500Mi"
          requests:
            cpu: "250m"
            memory: "250Mi"
        # securityContext:
        #   runAsUser: 1000
      # # Use this section to add the configuration of a side-car container.
      # # Comment it out or remove it if you don't need it.
      # # Spec for this container will be applied as is without any modifications.
      # - name: side-car
      #   image: example-sidecar
## template is the PodSpec for each runner Pod
## For reference: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodSpec
template:
template:
spec:
containers:
- name: runner
image: ghcr.io/actions/actions-runner:latest
command: ["/home/runner/run.sh"]
env:
- name: ACTIONS_RUNNER_CONTAINER_HOOKS
value: /home/runner/k8s/index.js
- name: ACTIONS_RUNNER_CONTAINER_HOOK_TEMPLATE
value: /etc/config/runner-template.yaml
- name: ACTIONS_RUNNER_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
value: "true"
volumeMounts:
- name: work
mountPath: /home/runner/_work
- mountPath: /etc/config
name: hook-template
volumes:
- name: hook-template
configMap:
name: runner-config
- name: work
ephemeral:
volumeClaimTemplate:
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "local-path"
resources:
requests:
storage: 1Gi
spec:
securityContext:
fsGroup: 1001
containers:
- name: runner
image: ghcr.io/actions/actions-runner:latest
command: ["/home/runner/run.sh"]
env:
- name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
value: "false"
nodeSelector:
purpose: github-actions-arm
tolerations:
- key: purpose
operator: Equal
value: github-actions-arm
effect: NoSchedule
## Optional controller service account that needs to have required Role and RoleBinding
## to operate this gha-runner-scale-set installation.
## The helm chart will try to find the controller deployment and its service account at installation time.
## In case the helm chart can't find the right service account, you can explicitly pass in the following value
## to help it finish RoleBinding with the right service account.
## Note: if your controller is installed to only watch a single namespace, you have to pass these values explicitly.
# controllerServiceAccount:
# namespace: arc-system
# name: test-arc-gha-runner-scale-set-controller
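The hook template referenced above via ACTIONS_RUNNER_CONTAINER_HOOK_TEMPLATE is mounted from the runner-config ConfigMap. A minimal sketch for creating it, assuming the template is saved locally as runner-template.yaml and the runners are deployed in a namespace named arc-runners (the namespace is a placeholder):

kubectl create configmap runner-config \
  --namespace arc-runners \
  --from-file=runner-template.yaml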
Controller Logs
https://gist.github.com/kanakaraju17/4f58c0b332451ef6fab345a8078a6b3b
Runner Pod Logs
https://gist.github.com/kanakaraju17/c61f8da3038741634acea68f40c12afc
Also interested in this, since we want to pull secrets into our repository workflows with hashicorp/vault-action.
I ran into this exact same issue using Kaniko and figured out a solution. When Node.js-based actions such as actions/checkout run, they're executed with the Node.js 20 runtime that is externally mounted into the workflow container at /__e/node20/bin/node. This Node.js executable isn't statically linked, so it dynamically links against libraries from the host it's running on, namely all of these:
$ ldd externals/node20/bin/node
linux-vdso.so.1 (0x000075316a1e9000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x000075316a1db000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x0000753169faf000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x0000753169ec8000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x0000753169ea8000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x0000753169ea3000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000753169c78000)
/lib64/ld-linux-x86-64.so.2 (0x000075316a1eb000)
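A quick way to confirm this from inside the stock gcr.io/kaniko-project/executor:debug container, using only BusyBox built-ins (the ldd output above names /lib64/ld-linux-x86-64.so.2 as the ELF interpreter):

# run inside the Kaniko debug job container
ls /lib64/ld-linux-x86-64.so.2 || echo "dynamic loader missing"
ls /lib/x86_64-linux-gnu/libc.so.6 || echo "libc missing"
# a missing ELF interpreter is exactly what BusyBox sh reports as "not found"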
The problem lies in how the Kaniko container image is built: it uses the empty scratch base image and copies in the Kaniko executable, some tools and, in the debug variant, BusyBox. As a result the Kaniko image doesn't contain any of the usual shared libraries. That's fine for Kaniko itself, which is written in Go and entirely statically linked, but Node.js won't work as-is because of the missing libraries, which causes the not-very-descriptive "not found" error. My solution is to build a custom Kaniko container image that includes the libraries from a relatively recent distro that supports Node.js 20; I used Debian bookworm:
FROM debian:bookworm-slim AS debian
# containerd/ARC attempt to run shell stuff inside the container, use the debug image since it
# contains busybox + utilities
FROM gcr.io/kaniko-project/executor:debug
# lie about the container being Debian to make some ARC stuff behave nicely
COPY --from=debian /etc/os-release /etc/os-release
# ARC runs nodejs actions on the workflow container by mounting node to the container. nodejs is
# dynamically linked and the Kaniko container doesn't contain any supporting libraries for node to
# run, so copy required libraries to the container
COPY --from=debian /lib/x86_64-linux-gnu/libdl.so.2 /lib/x86_64-linux-gnu/libdl.so.2
COPY --from=debian /lib/x86_64-linux-gnu/libstdc++.so.6 /lib/x86_64-linux-gnu/libstdc++.so.6
COPY --from=debian /lib/x86_64-linux-gnu/libm.so.6 /lib/x86_64-linux-gnu/libm.so.6
COPY --from=debian /lib/x86_64-linux-gnu/libgcc_s.so.1 /lib/x86_64-linux-gnu/libgcc_s.so.1
COPY --from=debian /lib/x86_64-linux-gnu/libpthread.so.0 /lib/x86_64-linux-gnu/libpthread.so.0
COPY --from=debian /lib/x86_64-linux-gnu/libc.so.6 /lib/x86_64-linux-gnu/libc.so.6
COPY --from=debian /lib64/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2
WORKDIR /workspace
ENTRYPOINT ["/kaniko/executor"]
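Build and push the image to a registry your runners can pull from, then point the job's container: image: at it; registry and tag below are placeholders:

docker build -t registry.example.com/kaniko-executor:debug-node .
docker push registry.example.com/kaniko-executor:debug-node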
Closing this one since it is not related to ARC.
Thank you, @Spanfile, for answering this issue!