Adding GO_TAGS: stablediffusion to a Kubernetes deployment causes a pod restart loop
Spawned off of #315
LocalAI version: 1.14.1
Environment, CPU architecture, OS, and Version: K3s 1.24, Flux, bjw-s app-template chart (https://github.com/bjw-s/helm-charts/tree/main/charts)
Describe the bug: Helm Release: https://github.com/lenaxia/home-ops-prod/blob/12a676aa0c09742c8426c972e49ea50102d09a5a/cluster/apps/home/localai/app/helm-release.yaml
To Reproduce: Helm install, then watch the pod restart-loop indefinitely.
Expected behavior: The pod finishes building the backends and starts serving, instead of restart-looping.
Logs: https://pastebin.com/3yDrK4pW
Additional context: If I can get a reference to a working example, I'd be happy to document what I was missing and what is needed to get Stable Diffusion working.
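To see why Kubernetes is killing the container, the pod's events are usually enough; something like the following should show it (the pod name placeholder is whatever the ReplicaSet generated):
kubectl get pods -l app.kubernetes.io/name=local-ai -w
kubectl describe pod <pod-name>   # the Events section names the reason, e.g. a failed liveness probe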
The logs look incomplete here; they end in the middle of building one of the backends.
This is the entirety of the logs. After this, the pod restarts and starts building from scratch. I'll capture the restarted pod's output and share it.
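The previous container's output survives one restart, so something like this (with the generated pod name substituted) should capture it:
kubectl logs <pod-name> --previous > restarted-pod.log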
I'll take a look at your notes in Discord and get back.
Here is the start of one of the restarts; as you can see, it just restarts the build process:
make -C go-llama clean
make[1]: Entering directory '/build/go-llama'
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: unknown
I UNAME_M: x86_64
I CFLAGS: -I./llama.cpp -I. -O3 -DNDEBUG -std=c11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wno-unused-function -pthread -march=native -mtune=native
I CXXFLAGS: -I./llama.cpp -I. -I./llama.cpp/examples -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -pthread
I LDFLAGS:
I BUILD_TYPE: openblas
I CMAKE_ARGS: -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DBLAS_INCLUDE_DIRS=/usr/include/openblas
I EXTRA_TARGETS:
I CC: cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX: g++ (Debian 10.2.1-6) 10.2.1 20210110
rm -rf *.o
rm -rf *.a
make -C llama.cpp clean
make[2]: Entering directory '/build/go-llama/llama.cpp'
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: unknown
I UNAME_M: x86_64
I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native
I LDFLAGS:
I CC: cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX: g++ (Debian 10.2.1-6) 10.2.1 20210110
...
I've also upgraded the initContainer to pre-populate the stablediffusion model files before the build starts, and it still doesn't seem to take.
https://github.com/lenaxia/home-ops-prod/blob/ccc3da0ed701f7dd08e370674ac330816d4c74db/cluster/apps/home/localai/app/helm-release.yaml
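A quick sanity check that the init container actually staged the files (pod name substituted):
kubectl exec <pod-name> -- ls -la /models/stablediffusion_assets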
@mudler can you just provide a full kubectl describe of your deployment/localai or just your raw helm release? That will help me see what else is different.
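For reference, commands along these lines would produce that output (names assume the labels used above):
kubectl describe deployment/localai
kubectl get pod -l app.kubernetes.io/name=local-ai -o yaml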
@lenaxia you may need to configure the default liveness/startup probe to account for your compilation time: https://github.com/bjw-s/helm-charts/blob/main/charts/library/common/values.yaml#L233 . Disable it temporarily to confirm it's the root cause.
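A minimal values sketch for the bjw-s app-template that disables the default probes while the backends compile (this mirrors the probes block in the working release further down):
probes:
  liveness:
    enabled: false
  readiness:
    enabled: false
  startup:
    enabled: false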
@mudler can you just provide a full kubectl describe of your deployment/localai or just your raw helm release? That will help me see what else is different.
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2023-05-23T22:55:16Z"
  generateName: local-ai-79684c99d5-
  labels:
    app.kubernetes.io/instance: local-ai
    app.kubernetes.io/name: local-ai
    pod-template-hash: 79684c99d5
  name: local-ai-79684c99d5-bslmr
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: local-ai-79684c99d5
    uid: 1b6826b8-ad64-40ca-bfa9-a128b986f0a6
  resourceVersion: "5285305"
  uid: 2f9fe420-f2a7-47cd-87a8-0e63bb530cf7
spec:
  containers:
  - env:
    - name: IMAGE_PATH
      value: /tmp
    - name: BUILD_TYPE
      value: openblas
    - name: GO_TAGS
      value: stablediffusion
    - name: DEBUG
      value: "true"
    - name: THREADS
      value: "8"
    - name: CONTEXT_SIZE
      value: "1024"
    - name: MODELS_PATH
      value: /models
    image: quay.io/go-skynet/local-ai:master
    imagePullPolicy: Always
    name: local-ai
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /models
      name: models
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-bhf8t
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - args:
    - "url=\"https://gpt4all.io/models/ggml-gpt4all-j.bin\"\nif [[ ! -f \"/models/${url##*/}\"
      ]]; then\n wget https://gpt4all.io/models/ggml-gpt4all-j.bin -P /models \nfi\n"
    command:
    - /bin/sh
    - -c
    image: busybox
    imagePullPolicy: Always
    name: download-model
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /models
      name: models
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-bhf8t
      readOnly: true
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: models
    persistentVolumeClaim:
      claimName: local-ai
  - name: kube-api-access-bhf8t
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-05-23T22:55:19Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-05-23T22:55:20Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-05-23T22:55:20Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-05-23T22:55:16Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://c4176fed1d32bb31036c7a0435a0b22be65be8978952c5399a511cdb8d830bc3
    image: quay.io/go-skynet/local-ai:master
    imageID: quay.io/go-skynet/local-ai@sha256:ca2243c76ff81201b9d131e5bcd0b13cb54316ad3b87cb5c64e00cfadf627d48
    lastState: {}
    name: local-ai
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-05-23T22:55:20Z"
  initContainerStatuses:
  - containerID: containerd://d60b61f8a57c017bfd2ef8dbd2b9c8d5106b55add038573d3e66c3145aad8981
    image: docker.io/library/busybox:latest
    imageID: docker.io/library/busybox@sha256:560af6915bfc8d7630e50e212e08242d37b63bd5c1ccf9bd4acccf116e262d5b
    lastState: {}
    name: download-model
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: containerd://d60b61f8a57c017bfd2ef8dbd2b9c8d5106b55add038573d3e66c3145aad8981
        exitCode: 0
        finishedAt: "2023-05-23T22:55:18Z"
        reason: Completed
        startedAt: "2023-05-23T22:55:18Z"
  phase: Running
  qosClass: BestEffort
  startTime: "2023-05-23T22:55:16Z"
@lenaxia are you using the LocalAI charts? https://github.com/go-skynet/LocalAI#run-localai-in-kubernetes
@sebastien-prudhomme it's always something royally stupid. Yes, it was the readiness probes. Thanks, it at least built.
@mudler No, I wasn't using the chart. I'm using the common app-template that the Kubernetes@Home community has adopted as standard best practice, which provides a lot of benefits for free, such as extensible volume mounts. Sebastien's pointer to the readiness probes was the key to the build issue; however, I'm now hitting a segfault.
I'll start a new issue for that.
For anyone coming across this later, here are two functioning deployments of LocalAI as of 2023.05.24. The first (Method 1) is a Helm release using Kubernetes@Home's app-template as a common base to create an all-in-one release. The second (Method 2) is a direct Pod deployment that can be adapted to whatever deployment method you want.
Method 1:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: &appname localai
  namespace: home
spec:
  interval: 20m
  chart:
    spec:
      chart: app-template
      version: 1.5.0
      interval: 5m
      sourceRef:
        kind: HelmRepository
        name: bjw-s-charts
        namespace: flux-system
  # See https://github.com/bjw-s/helm-charts/blob/main/charts/library/common/values.yaml
  values:
    image:
      repository: quay.io/go-skynet/local-ai
      tag: master
    env:
      - name: THREADS
        value: 14
      - name: CONTEXT_SIZE
        value: 1024
      - name: MODELS_PATH
        value: "/models"
      - name: IMAGE_PATH
        value: /tmp
      - name: BUILD_TYPE
        value: openblas
      - name: GO_TAGS
        value: stablediffusion
      - name: DEBUG
        value: "true"
    initContainers:
      download-model:
        image: busybox@sha256:b5d6fe0712636ceb7430189de28819e195e8966372edfc2d9409d79402a0dc16
        command: ["/bin/sh", "-c"]
        args:
          - |
            ## A simpler and more secure way, if you have a way of staging an archive with the files you need
            #wget "https://s3.${SECRET_DEV_DOMAIN}/public/stablediffusion.tar" -P /tmp
            #tar -xzvf /tmp/stablediffusion.tar -C $MODELS_PATH
            #rm -rf /tmp/stablediffusion.tar

            ## A more general and less secure way that grabs all the files from GitHub
            ## Details here: https://github.com/go-skynet/LocalAI
            ## And here: https://github.com/lenaxia/stablediffusion-bins/releases/tag/2023.05.24
            mkdir $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-256-256-fp16-opt.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-512-512-fp16-opt.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-base-fp16.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/FrozenCLIPEmbedder-fp16.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-256-256-MHA-fp16-opt.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-512-512-MHA-fp16-opt.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-base-MHA-fp16.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://github.com/EdVince/Stable-Diffusion-NCNN/raw/main/x86/linux/assets/log_sigmas.bin" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/vocab.txt" -P $MODELS_PATH/stablediffusion_assets
            wget "https://github.com/lenaxia/stablediffusion-bins/releases/download/2023.05.24/UNetModel-MHA-fp16.bin" -P $MODELS_PATH/stablediffusion_assets
            wget "https://github.com/lenaxia/stablediffusion-bins/releases/download/2023.05.24/FrozenCLIPEmbedder-fp16.bin" -P $MODELS_PATH/stablediffusion_assets
            wget "https://github.com/lenaxia/stablediffusion-bins/releases/download/2023.05.24/AutoencoderKL-fp16.bin" -P $MODELS_PATH/stablediffusion_assets
            wget "https://github.com/lenaxia/stablediffusion-bins/releases/download/2023.05.24/AutoencoderKL-encoder-512-512-fp16.bin" -P $MODELS_PATH/stablediffusion_assets
            cat << "EOF" >> $MODELS_PATH/stablediffusion.yaml
            name: stablediffusion
            backend: stablediffusion
            asset_dir: stablediffusion_assets
            EOF
        env:
          - name: URL
            value: "https://gpt4all.io/models/ggml-gpt4all-j.bin"
          - name: MODELS_PATH
            value: "/models"
        volumeMounts:
          - name: models
            mountPath: /models
        securityContext:
          runAsUser: 0
    persistence:
      models:
        enabled: true
        storageClass: local-path
        size: 30Gi
        type: pvc
        accessMode: ReadWriteOnce
    service:
      main:
        type: LoadBalancer
        ports:
          http:
            port: &port 8080
    ingress:
      main:
        enabled: true
        annotations:
          hajimari.io/enable: "true"
          hajimari.io/icon: eos-icons:ai
          hajimari.io/info: Local AI
          hajimari.io/group: home
          cert-manager.io/cluster-issuer: "letsencrypt-production"
          traefik.ingress.kubernetes.io/router.entrypoints: "websecure"
          traefik.ingress.kubernetes.io/router.middlewares: networking-chain-authelia@kubernetescrd
        hosts:
          - host: &uri ai.${SECRET_DEV_DOMAIN}
            paths:
              - path: /
                pathType: Prefix
        tls:
          - hosts:
              - *uri
            secretName: *uri
    nodeSelector:
      node-role.kubernetes.io/worker: "true"
    probes:
      liveness:
        enabled: false
        custom: true
        spec:
          httpGet:
            path: /healthz
            port: *port
          initialDelaySeconds: 0
          periodSeconds: 30
          timeoutSeconds: 1
          failureThreshold: 3
      readiness:
        enabled: false
        custom: true
        spec:
          httpGet:
            path: /readyz
            port: *port
          initialDelaySeconds: 0
          periodSeconds: 30
          timeoutSeconds: 1
          failureThreshold: 3
      startup:
        enabled: false
Method 2:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app.kubernetes.io/instance: local-ai
    app.kubernetes.io/name: local-ai
  name: local-ai
  namespace: default
spec:
  containers:
  - env:
    - name: IMAGE_PATH
      value: /tmp
    - name: BUILD_TYPE
      value: openblas
    - name: GO_TAGS
      value: stablediffusion
    - name: DEBUG
      value: "true"
    - name: THREADS
      value: "8"
    - name: CONTEXT_SIZE
      value: "1024"
    - name: MODELS_PATH
      value: /models
    image: quay.io/go-skynet/local-ai:master
    imagePullPolicy: Always
    name: local-ai
    volumeMounts:
    - mountPath: /models
      name: models
  initContainers:
  - command:
    - /bin/sh
    - -c
    args:
    - |
      wget "https://s3.thekao.cloud/public/stablediffusion.tar" -P /tmp
      tar -xzvf /tmp/stablediffusion.tar -C /models
      rm -rf /tmp/stablediffusion.tar
    image: busybox
    imagePullPolicy: Always
    name: download-model
    volumeMounts:
    - mountPath: /models
      name: models
  restartPolicy: Always
  volumes:
  - name: models
    persistentVolumeClaim:
      claimName: local-ai-data
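Once the pod is up, a quick smoke test of the stablediffusion backend through LocalAI's OpenAI-compatible images endpoint (the service name and port are assumptions based on the releases above):
kubectl port-forward svc/localai 8080:8080
curl http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A cute baby sea otter", "model": "stablediffusion", "size": "256x256"}'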