Runners are not able to pick up the self-hosted cache server
Hey Louis,
I am trying to set up the cache server, but the runners are not able to connect to the self-hosted cache server.
Cache Server Manifest
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: gha-cache-server
namespace: gha-cache-server
spec:
serviceName: gha-cache-server
replicas: 1
selector:
matchLabels:
app: gha-cache-server
template:
metadata:
labels:
app: gha-cache-server
spec:
containers:
- name: gha-cache-server
image: ghcr.io/falcondev-oss/github-actions-cache-server:latest
ports:
- containerPort: 3000
name: http
volumeMounts:
- name: gcs-credentials
mountPath: /app/.gcs
readOnly: true
- name: cache-data
mountPath: /app/.data
env:
- name: URL_ACCESS_TOKEN
value: "w5AQ9VKtzc"
- name: API_BASE_URL
value: "http://<Load Balancer IP of this app>:3000"
- name: TEMP_DIR
value: "/app/.data/tmp"
- name: STORAGE_DRIVER
value: "gcs"
- name: STORAGE_GCS_BUCKET
value: "gha-cache-server-storage"
- name: STORAGE_GCS_SERVICE_ACCOUNT_KEY
value: "/app/.gcs/sa-gcs-cache-server-keyfile.json"
- name: DEBUG
value: "true"
volumes:
- name: gcs-credentials
secret:
secretName: gha-cache-server-secret
items:
- key: sa-gcs-cache-server-keyfile.json
path: sa-gcs-cache-server-keyfile.json
- name: cache-data
emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
name: gha-cache-server-lb
namespace: gha-cache-server
labels:
app: gha-cache-server
annotations:
load-balancer.hetzner.cloud/location: hil
spec:
selector:
app: gha-cache-server
ports:
- protocol: TCP
port: 3000
targetPort: 3000
type: LoadBalancer
Runner Image
FROM ghcr.io/actions/actions-runner:2.319.1
# modify actions runner binaries to allow custom cache server implementation
RUN sed -i 's/\x41\x00\x43\x00\x54\x00\x49\x00\x4F\x00\x4E\x00\x53\x00\x5F\x00\x43\x00\x41\x00\x43\x00\x48\x00\x45\x00\x5F\x00\x55\x00\x52\x00\x4C\x00/\x41\x00\x43\x00\x54\x00\x49\x00\x4F\x00\x4E\x00\x53\x00\x5F\x00\x43\x00\x41\x00\x43\x00\x48\x00\x45\x00\x5F\x00\x4F\x00\x52\x00\x4C\x00/g' /home/runner/bin/Runner.Worker.dll
# Switch to root user to install packages
USER root
# Update package lists and install required tools
RUN apt-get update && apt-get install -y \
unzip \
wget \
g++ \
make \
zip \
python3 \
python3-pip \
iproute2 \
net-tools \
pkg-config \
curl
# Install GitHub CLI
RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | gpg --dearmor -o /usr/share/keyrings/githubcli-archive-keyring.gpg && \
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | tee /etc/apt/sources.list.d/github-cli.list > /dev/null && \
apt update && \
apt install -y gh
ENV ACTIONS_CACHE_URL="http://<Load Balancer IP of the cache server>:3000/w5AQ9VKtzc/"
# Switch back to the runner user
USER runner
I have deployed the image using ARC runners in one cluster. The cache server is deployed in another cluster.
Error logs from the cache server
╰─▶▶ k logs gha-cache-server-0 ⎈ | luxorlabs ⇆ gha-cache-server
[cache-server] ℹ Cleaning up cache entries older than 90d with schedule 0 0 * * * (next run: 10/9/2024, 12:00:00 AM)
[cache-server] ℹ 🚀 Starting GitHub Actions Cache Server (v3.1.0)
[cache-server] ℹ Using database driver: sqlite
[cache-server] ℹ Migrating database...
[cache-server] ✔ Database migrated
[cache-server] ℹ Using storage driver: gcs
Listening on http://[::]:3000
[cache-server] ERROR Response: GET /w5AQ9VKtzc/ > 404
Cannot find any route matching /w5AQ9VKtzc/.
at createError$1 (server/chunks/runtime.mjs:1886:15)
at matchHandler (server/chunks/runtime.mjs:3017:16)
at Object.handler (server/chunks/runtime.mjs:3056:19)
at Object.handler (server/chunks/runtime.mjs:2832:31)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Server.toNodeHandle (server/chunks/runtime.mjs:3102:7)
[cache-server] ERROR Response: GET /healthz > 404
Cannot find any route matching /healthz.
at createError$1 (server/chunks/runtime.mjs:1886:15)
at matchHandler (server/chunks/runtime.mjs:3017:16)
at Object.handler (server/chunks/runtime.mjs:3056:19)
at Object.handler (server/chunks/runtime.mjs:2832:31)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Server.toNodeHandle (server/chunks/runtime.mjs:3102:7)
[cache-server] ERROR Response: GET /w5AQ9VKtzc/caches > 404
Cannot find any route matching /w5AQ9VKtzc/caches.
at createError$1 (server/chunks/runtime.mjs:1886:15)
at matchHandler (server/chunks/runtime.mjs:3017:16)
at Object.handler (server/chunks/runtime.mjs:3056:19)
at Object.handler (server/chunks/runtime.mjs:2832:31)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Server.toNodeHandle (server/chunks/runtime.mjs:3102:7)
[cache-server] ERROR Response: GET /w5AQ9VKtzc/ > 404
Cannot find any route matching /w5AQ9VKtzc/.
at createError$1 (server/chunks/runtime.mjs:1886:15)
at matchHandler (server/chunks/runtime.mjs:3017:16)
at Object.handler (server/chunks/runtime.mjs:3056:19)
at Object.handler (server/chunks/runtime.mjs:2832:31)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Server.toNodeHandle (server/chunks/runtime.mjs:3102:7)
[cache-server] ERROR Response: GET /w5AQ9VKtzc > 404
Cannot find any route matching /w5AQ9VKtzc.
at createError$1 (server/chunks/runtime.mjs:1886:15)
at matchHandler (server/chunks/runtime.mjs:3017:16)
at Object.handler (server/chunks/runtime.mjs:3056:19)
at Object.handler (server/chunks/runtime.mjs:2832:31)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Server.toNodeHandle (server/chunks/runtime.mjs:3102:7)
[cache-server] ERROR Response: GET /w5AQ9VKtzc/caches > 404
Cannot find any route matching /w5AQ9VKtzc/caches.
at createError$1 (server/chunks/runtime.mjs:1886:15)
at matchHandler (server/chunks/runtime.mjs:3017:16)
at Object.handler (server/chunks/runtime.mjs:3056:19)
at Object.handler (server/chunks/runtime.mjs:2832:31)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Server.toNodeHandle (server/chunks/runtime.mjs:3102:7)
[h3] Please prefer using `message` for longer error messages instead of `statusMessage`. In the future, `statusMessage` will be sanitized by default.
[cache-server] ERROR Response: GET /<secret_token>/_apis/artifactcache/cache > 400
Invalid query parameters: [
{
"code": "invalid_type",
"expected": "string",
"received": "undefined",
"path": [
"keys"
],
"message": "Required"
},
{
"code": "invalid_type",
"expected": "string",
"received": "undefined",
"path": [
"version"
],
"message": "Required"
}
]
{
"code": "invalid_type",
"expected": "string",
"received": "undefined",
"path": [
"keys"
],
"message": "Required"
},
{
"code": "invalid_type",
"expected": "string",
"received": "undefined",
"path": [
"version"
],
"message": "Required"
}
]
at createError$1 (server/chunks/runtime.mjs:1886:15)
at handler (server/chunks/routes/_token/_apis/artifactcache/cache.get.mjs:35:13)
at _callHandler (server/chunks/runtime.mjs:2712:22)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Object.handler (server/chunks/runtime.mjs:2832:19)
at async Server.toNodeHandle (server/chunks/runtime.mjs:3102:7)
[cache-server] ERROR Response: GET /favicon.ico > 404
Cannot find any route matching /favicon.ico.
at createError$1 (server/chunks/runtime.mjs:1886:15)
at matchHandler (server/chunks/runtime.mjs:3017:16)
at Object.handler (server/chunks/runtime.mjs:3056:19)
at Object.handler (server/chunks/runtime.mjs:2832:31)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Server.toNodeHandle (server/chunks/runtime.mjs:3102:7)
Could you please help me figure out what's wrong here? My runners are still using GitHub's default cache servers.
Are your runners not using the custom cache server at all?
yup, not at all
workflow logs
Prepare all required actions
Getting action download info
Download action repository 'pnpm/action-setup@v3' (SHA:a3252b78c470c02df07e9d59298aecedc3ccdd6d)
Download action repository 'actions/setup-node@v4' (SHA:0a44ba7841725637a19e28fa30b79a866c81b0a6)
Download action repository 'actions/cache@v4' (SHA:3624ceb22c1c5a301c8db4169662070a689d9ea8)
Run ./.github/actions/ci-setup
Run pnpm/action-setup@v3
Running self-installer...
Installation Completed!
Run actions/setup-node@v4
Attempting to download 20.12.2...
Acquiring 20.12.2 - x64 from https://github.com/actions/node-versions/releases/download/20.12.2-8647736879/node-20.12.2-linux-x64.tar.gz
Extracting ...
/usr/bin/tar xz --strip 1 --warning=no-unknown-keyword --overwrite -C /home/runner/_work/_temp/d1d40f08-c12c-4482-9667-05ec5cf57105 -f /home/runner/_work/_temp/48580388-f434-498d-b687-27ffb37a0e09
Adding to the cache ...
Environment details
Run echo "pnpm_cache_dir=$(pnpm store path)" >> $GITHUB_OUTPUT
Run actions/cache@v4
Cache not found for input keys: Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f
Run pnpm install --frozen-lockfile --prefer-offline
Scope: all 58 workspace projects
Lockfile is up to date, resolution step is skipped
Packages: +5361
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Progress: resolved 0, reused 0, downloaded 1, added 0
Progress: resolved 0, reused 0, downloaded 251, added 302
Progress: resolved 0, reused 0, downloaded 412, added 478
Progress: resolved 0, reused 0, downloaded 617, added 763
My GCS bucket is empty too.
Are you sure the sed command in the runner image is working and that ARC is using the image?
sed isn't working, and I don't know why,
but when I manually export the variable inside the runner pod, it does get a value.
After observing this, I added the ACTIONS_CACHE_URL variable to the image itself. The runner's deployment manifest still contains this variable as well.
You can see the variable below only because I added it to the image. Before that, when I checked in the pod, the variable was empty.
╰─▶▶ k exec -it lc-std-ubuntu-4c-16g-c2xh9-runner-p8dqs -- bash ⎈ | gh-runners-backup1 ⇆ arc-runners
Defaulted container "runner" out of: runner, dind, init-dind-externals (init)
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.
runner@lc-std-ubuntu-4c-16g-c2xh9-runner-p8dqs:~$ echo $ACTIONS_CACHE_URL
http://<IP>:3000/w5AQ9VKtzc/
runner@lc-std-ubuntu-4c-16g-c2xh9-runner-p8dqs:~$
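One way to check whether the sed patch took effect is to look inside the binary for the renamed variable. The sed expression in the runner image targets `ACTIONS_CACHE_URL` stored as UTF-16LE (each ASCII byte followed by `\x00`) and rewrites it to `ACTIONS_CACHE_ORL`. The following is a rough sketch, not something from the project's docs: the helper names `contains_utf16le` and `report_patch_state` are made up for illustration, and the default path is the one from the Dockerfile above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if FILE ($1) contains NEEDLE ($2) stored as
# UTF-16LE (ASCII characters interleaved with NUL bytes).
# Caveat: NUL bytes are simply dropped before searching, so a plain-ASCII
# occurrence of the same string would match as well.
contains_utf16le() {
  tr -d '\000' < "$1" | grep -a -c -F -- "$2" > /dev/null
}

# Hypothetical helper: print "patched", "unpatched", or "unknown" depending
# on which variable name is found in the runner binary.
# The default path is the file patched by sed in the Dockerfile.
report_patch_state() {
  dll=${1:-/home/runner/bin/Runner.Worker.dll}
  if contains_utf16le "$dll" ACTIONS_CACHE_ORL; then
    echo "patched"
  elif contains_utf16le "$dll" ACTIONS_CACHE_URL; then
    echo "unpatched"
  else
    echo "unknown"
  fi
}
```

Running `report_patch_state` inside the runner pod (for example via `k exec`) would print `unpatched` if the image in use never had the sed step applied.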
Now that the URL is available, the logs look like this:
╰─▶▶ k logs gha-cache-server-0 ⎈ | luxorlabs ⇆ gha-cache-server
[cache-server] ℹ Cleaning up cache entries older than 90d with schedule 0 0 * * * (next run: 10/9/2024, 12:00:00 AM)
[cache-server] ℹ 🚀 Starting GitHub Actions Cache Server (v3.1.0)
[cache-server] ℹ Using database driver: sqlite
[cache-server] ℹ Migrating database...
[cache-server] ✔ Database migrated
[cache-server] ℹ Using storage driver: gcs
Listening on http://[::]:3000
[cache-server] ERROR Response: GET /w5AQ9VKtzc/ > 404
Cannot find any route matching /w5AQ9VKtzc/.
at createError$1 (server/chunks/runtime.mjs:1886:15)
at matchHandler (server/chunks/runtime.mjs:3017:16)
at Object.handler (server/chunks/runtime.mjs:3056:19)
at Object.handler (server/chunks/runtime.mjs:2832:31)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Server.toNodeHandle (server/chunks/runtime.mjs:3102:7)
You can see that the path is wrong: [cache-server] ERROR Response: GET /w5AQ9VKtzc/ > 404
I followed exactly what was in the docs: ACTIONS_CACHE_URL/random token
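For context, the 400 entry in the first cache server log shows the route the server does serve, `/<secret_token>/_apis/artifactcache/cache`, and that it requires the `keys` and `version` query parameters. A 404 on the bare token path is therefore not, by itself, evidence that `ACTIONS_CACHE_URL` is malformed. This is a reading of the logs, not something confirmed in this thread. The sketch below composes a lookup URL of that shape; the function name `cache_lookup_url` is hypothetical, and the values are not URL-encoded.

```shell
#!/bin/sh
# Hypothetical helper: build a cache lookup URL from a base URL of the form
# http://<host>:3000/<token>/ plus the two query parameters that the
# server's 400 response lists as required (keys, version).
# Note: values are inserted as-is; no URL-encoding is performed.
cache_lookup_url() {
  base=$1
  keys=$2
  version=$3
  # Drop one trailing slash so exactly one slash separates base and path.
  base=${base%/}
  printf '%s/_apis/artifactcache/cache?keys=%s&version=%s\n' "$base" "$keys" "$version"
}
```

For example, `cache_lookup_url "$ACTIONS_CACHE_URL" some-key some-version` prints a URL that can be passed to `curl` from inside the runner pod. A JSON reply or a "not found" style reply from that route would indicate that the token path is being recognized.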
Please try running the workflow with debug logs and share workflow and cache server logs
How can I run the workflow with debug logs?
Rerun the workflow from the GitHub UI and select 'Enable debug logging'.
got it, thanks
workflow logs
Run actions/cache@v4
##[debug]Resolved Keys:
##[debug]["Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f"]
##[debug]Checking zstd --quiet --version
##[debug]Unable to locate executable file: zstd. Please verify either the file path exists or the file can be found within a directory specified by the PATH environment variable. Also check the file mode to verify the file is executable.
##[debug]
##[debug]zstd version: null
##[debug]Resource Url: http://5.78.160.71:3000/w5AQ9VKtzc/_apis/artifactcache/cache?keys=Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f&version=a738bf0a0b0b2f5f660068b807d9ac65ab6b46436d4b48c6b10d4f11b5f2b830
##[debug]Resource Url: http://5.78.160.71:3000/w5AQ9VKtzc/_apis/artifactcache/caches?key=Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f
##[debug]Failed to delete archive: Error: ENOENT: no such file or directory, unlink ''
Cache not found for input keys: Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f
##[debug]Node Action run completed with exit code 0
##[debug]Save intra-action state CACHE_KEY = Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f
##[debug]Finished: run
##[debug]Evaluating condition for step: 'run'
##[debug]Evaluating: success()
##[debug]Evaluating success:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: run
##[debug]Loading inputs
##[debug]Loading env
Run pnpm install --frozen-lockfile --prefer-offline
##[debug]/usr/bin/bash --noprofile --norc -e -o pipefail /home/runner/_work/_temp/72aa8493-c895-4e96-b541-6c987f2c3254.sh
Scope: all 58 workspace projects
Lockfile is up to date, resolution step is skipped
Packages: +5361
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
is http://<IP>:3000 reachable by the runner?
yup
runner@lc-std-ubuntu-4c-16g-c2xh9-runner-2chgp:~$ curl http://<IP>:3000
OK
runner@lc-std-ubuntu-4c-16g-c2xh9-runner-2chgp:~$
Try installing zstd in the runner image
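The debug log above shows the cache action probing for `zstd` and failing to locate it. A quick way to confirm what the runner image ships is to check `PATH` from inside the pod. This is a minimal sketch with a hypothetical helper name (`tool_status`); whether the missing `zstd` was related to the cache miss is not established in this thread.

```shell
#!/bin/sh
# Hypothetical helper: report whether a tool can be found on PATH,
# similar in spirit to the "Checking zstd --quiet --version" probe
# in the workflow debug log.
tool_status() {
  if command -v "$1" > /dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}
```

Running `tool_status zstd` in the runner pod would print `zstd: missing` for the image as posted. Adding `zstd` to the `apt-get install` list in the Dockerfile is one way to provide it.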
workflow logs
Run actions/cache@v4
##[debug]Resolved Keys:
##[debug]["Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f"]
##[debug]Checking zstd --quiet --version
##[debug]1.4.8
##[debug]zstd version: 1.4.8
##[debug]Resource Url: http://<IP>:3000/w5AQ9VKtzc/_apis/artifactcache/cache?keys=Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f&version=d6b3480e582678f6051d9292cad6d2b11ea6c6d47cfac0e0602ad6e4f421bc9c
##[debug]Resource Url: http://<IP>:3000/w5AQ9VKtzc/_apis/artifactcache/caches?key=Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f
##[debug]Failed to delete archive: Error: ENOENT: no such file or directory, unlink ''
Cache not found for input keys: Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f
##[debug]Node Action run completed with exit code 0
##[debug]Save intra-action state CACHE_KEY = Linux-pnpm-store-1b0fe6ba7303b5dcca02ffe6d23d852ac628ff3fa420dd42f739e75e43a0586f
##[debug]Finished: run
##[debug]Evaluating condition for step: 'run'
##[debug]Evaluating: success()
##[debug]Evaluating success:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: run
##[debug]Loading inputs
##[debug]Loading env
Do you need to add the path package? It was suggested by many Stack Overflow answers. I wonder why it worked fine for others but shows errors in my case.
Could you share the workflow file?
actions-lab.yaml
jobs:
format-lint-type-check:
name: Format, Lint And Type Check
timeout-minutes: 15
runs-on: lc-std-ubuntu-4c-16g
steps:
- uses: actions/checkout@v4
- uses: ./.github/actions/ci-setup
- name: Format
run: pnpm format:check
- name: Lint
run: pnpm lint --summarize
- uses: ./.github/actions/turbo-summarize
- name: Typescript
run: pnpm type-check --summarize
- uses: ./.github/actions/turbo-summarize
ci-setup/action.yaml
name: "CI setup"
runs:
using: "composite"
steps:
- name: Use PNPM
uses: pnpm/action-setup@v3
with:
version: ${{ env.PNPM_VERSION }}
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Get pnpm store directory
id: pnpm-cache
run: |
echo "pnpm_cache_dir=$(pnpm store path)" >> $GITHUB_OUTPUT
shell: bash
- name: Setup pnpm cache
uses: actions/cache@v4
with:
path: ${{ steps.pnpm-cache.outputs.pnpm_cache_dir }}
key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
# be careful enabling, fallback might be bad
# restore-keys: |
# ${{ runner.os }}-pnpm-store-
- name: Install dependencies (with cache)
run: pnpm install --frozen-lockfile --prefer-offline
shell: bash
Are there any cache server logs when the workflow runs? The workflow debug logs also look incomplete. They should look something like this:
##[debug]pnpm's cache folder "/home/runner/_work/.pnpm-store/v3" configured for the root directory
##[debug]followSymbolicLinks 'true'
##[debug]followSymbolicLinks 'true'
##[debug]implicitDescendants 'true'
##[debug]matchDirectories 'true'
##[debug]omitBrokenSymbolicLinks 'true'
##[debug]Found 1 files to hash.
##[debug]primary key is node-cache-Linux-pnpm-616c043f3c59d657c3ae0d954524e62320d032e312019b77faa3648e73a45aee
##[debug]Resolved Keys:
##[debug]["node-cache-Linux-pnpm-616c043f3c59d657c3ae0d954524e62320d032e312019b77faa3648e73a45aee"]
##[debug]Checking zstd --quiet --version
##[debug]1.4.8
##[debug]zstd version: 1.4.8
##[debug]Resource Url: http://cache-server.default.svc.cluster.local:3000/FcQoqdUQlu3l80Tb2GgrrpImR/_apis/artifactcache/cache?keys=node-cache-Linux-pnpm-616c043f3c59d657c3ae0d954524e62320d032e312019b77faa3648e73a45aee&version=d6b3480e582678f6051d9292cad6d2b11ea6c6d47cfac0e0602ad6e4f421bc9c
::add-mask::***
##[debug]Cache Result:
##[debug]{"archiveLocation":"***","cacheKey":"node-cache-Linux-pnpm-616c043f3c59d657c3ae0d954524e62320d032e312019b77faa3648e73a45aee"}
##[debug]Archive Path: /home/runner/_work/_temp/4f2aaba6-d36c-465e-bd2b-7abb7517e1e3/cache.tzst
##[debug]Use Azure SDK: false
##[debug]Download concurrency: 8
##[debug]Request timeout (ms): 30000
##[debug]Cache segment download timeout mins env var: undefined
##[debug]Segment download timeout (ms): 600000
##[debug]Lookup only: false
##[debug]Unable to validate download, no Content-Length header
/usr/bin/tar -tf /home/runner/_work/_temp/4f2aaba6-d36c-465e-bd2b-7abb7517e1e3/cache.tzst -P --use-compress-program unzstd
Nope, it's not able to connect at all; that's why I don't have any cache server logs.
I don't see a reason why it wouldn't work. You could try adding a workflow step that just curls the cache server's url to see whether it really is reachable from the runner. Also please upload a full debug workflow run log file of a workflow which uses the cache action.
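A reachability check of that kind could look roughly like the sketch below, run from a workflow `run:` step or from inside the runner pod. The helper names `verdict_for_status` and `probe_cache_server` are hypothetical. It relies on `curl` printing `000` for `%{http_code}` when no HTTP response was received.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status code (or curl's "000", meaning
# no HTTP response was received) to a short verdict for the workflow log.
verdict_for_status() {
  case $1 in
    000) echo "unreachable" ;;
    2??|3??) echo "reachable" ;;
    *) echo "reachable, but returned HTTP $1" ;;
  esac
}

# Hypothetical helper: request URL ($1) and print the verdict.
# -s silences progress output, -o /dev/null discards the body,
# -m 10 caps the whole request at 10 seconds,
# -w '%{http_code}' prints only the status code.
probe_cache_server() {
  code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$1") || true
  verdict_for_status "${code:-000}"
}
```

Based on the logs in this thread, probing the server root (`http://<IP>:3000`) is the more telling check, since the bare token path returns 404 even when the server is up.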
Okay, I made a Notion doc to share the logs with you in a better way. Here are the logs and code of all the relevant files: https://luxorlabs.notion.site/Github-Cache-Server-Debugging-b9bdabad62224b0c9fe62675d778e208?pvs=4
let me know if you need anything else
@LouisHaftmann , when you get some time could you please help with this?
does this mean it worked?
I just ran it and saw the cache in the bucket. I could see it in the server logs as well.
Check log2 in the Notion page that I shared.
The logs and your screenshot suggest it is working
Yup, no idea how it worked; I made no changes. Still, I would request that you not close this issue: the bucket ingress traffic costs are high, so using the bucket as a cache won't be worth it. I will try to use the local file system with a PVC instead.
I will share any issues or doubts here. If I don't find any, I will let you know and we can close this.
Hey @LouisHaftmann, I have some other priorities and won't be able to implement local file storage with the cache server now. Please feel free to close the ticket. I will raise an issue if I need help later.
thanks for the help