
Adding GO_TAGS: stablediffusion to Kubernetes causes pod restart loop

Open lenaxia opened this issue 2 years ago • 3 comments

Spawned off of #315

LocalAI version: 1.14.1

Environment, CPU architecture, OS, and Version: K3s 1.24 Flux https://github.com/bjw-s/helm-charts/tree/main/charts

Describe the bug
Helm release: https://github.com/lenaxia/home-ops-prod/blob/12a676aa0c09742c8426c972e49ea50102d09a5a/cluster/apps/home/localai/app/helm-release.yaml

To Reproduce
Helm install, then watch the pod restart-loop indefinitely.

Expected behavior
The pod finishes compiling the backends and starts serving without restarting.

Logs
https://pastebin.com/3yDrK4pW

Additional context
If I can get a reference to a working example, I'd be happy to document what I'm missing and what is needed to get stable diffusion working.

lenaxia avatar May 23 '23 05:05 lenaxia

Logs look incomplete here; they end in the middle of building one of the backends.

mudler avatar May 23 '23 07:05 mudler

Logs look incomplete here; they end in the middle of building one of the backends.

This is the entirety of the logs. After this point the pod restarts and starts building from scratch. I'll capture the restarted pod's output and share it.

I'll take a look at your notes in Discord and get back to you.

lenaxia avatar May 23 '23 13:05 lenaxia

Here is the start of one of the restarts, as you can see, it just restarts the build process:

make -C go-llama clean
make[1]: Entering directory '/build/go-llama'
I llama.cpp build info:
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  x86_64
I CFLAGS:   -I./llama.cpp -I. -O3 -DNDEBUG -std=c11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wno-unused-function -pthread -march=native -mtune=native
I CXXFLAGS: -I./llama.cpp -I. -I./llama.cpp/examples -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -pthread
I LDFLAGS:
I BUILD_TYPE:  openblas
I CMAKE_ARGS:  -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DBLAS_INCLUDE_DIRS=/usr/include/openblas
I EXTRA_TARGETS:
I CC:       cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX:      g++ (Debian 10.2.1-6) 10.2.1 20210110

rm -rf *.o
rm -rf *.a
make -C llama.cpp clean
make[2]: Entering directory '/build/go-llama/llama.cpp'
I llama.cpp build info:
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  x86_64
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native
I LDFLAGS:
I CC:       cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX:      g++ (Debian 10.2.1-6) 10.2.1 20210110
...

I've also updated the initContainer to pre-populate the stable diffusion model files before the build starts, and it still doesn't seem to take.

https://github.com/lenaxia/home-ops-prod/blob/ccc3da0ed701f7dd08e370674ac330816d4c74db/cluster/apps/home/localai/app/helm-release.yaml

@mudler can you provide a full kubectl describe of your deployment/localai, or just your raw helm release? That will help me see what else is different.

lenaxia avatar May 23 '23 22:05 lenaxia

@lenaxia you may need to configure the default liveness/startup probe to match your compilation time: https://github.com/bjw-s/helm-charts/blob/main/charts/library/common/values.yaml#L233 . Disable it temporarily to confirm it's the root cause.
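For example, in the app-template/common library values, that would look roughly like this (a sketch, not tested here; tune the probe numbers to your actual compile time, and note the `/readyz` path and port are assumptions based on LocalAI's defaults):

```yaml
# Sketch for bjw-s app-template / common library values.
# Option A: disable the probes entirely while debugging.
# Option B: keep a startup probe, but give it enough headroom for the in-pod build.
probes:
  liveness:
    enabled: false
  readiness:
    enabled: false
  startup:
    enabled: true
    custom: true
    spec:
      httpGet:
        path: /readyz
        port: 8080
      periodSeconds: 30
      failureThreshold: 60   # 30s x 60 failures = up to 30 minutes of compile time
```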

sebastien-prudhomme avatar May 24 '23 19:05 sebastien-prudhomme


apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2023-05-23T22:55:16Z"
  generateName: local-ai-79684c99d5-
  labels:
    app.kubernetes.io/instance: local-ai
    app.kubernetes.io/name: local-ai
    pod-template-hash: 79684c99d5
  name: local-ai-79684c99d5-bslmr
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: local-ai-79684c99d5
    uid: 1b6826b8-ad64-40ca-bfa9-a128b986f0a6
  resourceVersion: "5285305"
  uid: 2f9fe420-f2a7-47cd-87a8-0e63bb530cf7
spec:
  containers:
  - env:
    - name: IMAGE_PATH
      value: /tmp
    - name: BUILD_TYPE
      value: openblas
    - name: GO_TAGS
      value: stablediffusion
    - name: DEBUG
      value: "true"
    - name: THREADS
      value: "8"
    - name: CONTEXT_SIZE
      value: "1024"
    - name: MODELS_PATH
      value: /models
    image: quay.io/go-skynet/local-ai:master
    imagePullPolicy: Always
    name: local-ai
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /models
      name: models
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-bhf8t
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - args:
    - "url=\"https://gpt4all.io/models/ggml-gpt4all-j.bin\"\nif [[ ! -f \"/models/${url##*/}\"
      ]]; then\n  wget https://gpt4all.io/models/ggml-gpt4all-j.bin -P /models \nfi\n"
    command:
    - /bin/sh
    - -c
    image: busybox
    imagePullPolicy: Always
    name: download-model
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /models
      name: models
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-bhf8t
      readOnly: true
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: models
    persistentVolumeClaim:
      claimName: local-ai
  - name: kube-api-access-bhf8t
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-05-23T22:55:19Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-05-23T22:55:20Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-05-23T22:55:20Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-05-23T22:55:16Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://c4176fed1d32bb31036c7a0435a0b22be65be8978952c5399a511cdb8d830bc3
    image: quay.io/go-skynet/local-ai:master
    imageID: quay.io/go-skynet/local-ai@sha256:ca2243c76ff81201b9d131e5bcd0b13cb54316ad3b87cb5c64e00cfadf627d48
    lastState: {}
    name: local-ai
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-05-23T22:55:20Z"

  initContainerStatuses:
  - containerID: containerd://d60b61f8a57c017bfd2ef8dbd2b9c8d5106b55add038573d3e66c3145aad8981
    image: docker.io/library/busybox:latest
    imageID: docker.io/library/busybox@sha256:560af6915bfc8d7630e50e212e08242d37b63bd5c1ccf9bd4acccf116e262d5b
    lastState: {}
    name: download-model
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: containerd://d60b61f8a57c017bfd2ef8dbd2b9c8d5106b55add038573d3e66c3145aad8981
        exitCode: 0
        finishedAt: "2023-05-23T22:55:18Z"
        reason: Completed
        startedAt: "2023-05-23T22:55:18Z"
  phase: Running
  qosClass: BestEffort
  startTime: "2023-05-23T22:55:16Z"
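As an aside, the guard in the download-model init container above derives the target filename from the URL with POSIX parameter expansion. A standalone illustration (note `[ ]` is the more portable test syntax; `[[ ]]` support varies across busybox builds):

```shell
# "${url##*/}" strips the longest prefix ending in '/', leaving the filename.
url="https://gpt4all.io/models/ggml-gpt4all-j.bin"
file="${url##*/}"
echo "$file"   # prints: ggml-gpt4all-j.bin

# Portable existence check before downloading (path is illustrative):
if [ ! -f "/models/$file" ]; then
  echo "missing, would run: wget $url -P /models"
fi
```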

mudler avatar May 24 '23 20:05 mudler

@lenaxia are you using the LocalAI charts? https://github.com/go-skynet/LocalAI#run-localai-in-kubernetes

mudler avatar May 24 '23 20:05 mudler

@sebastien-prudhomme it's always something royally stupid. Yes, it was the readiness probes. Thanks, at least it builds now.

@mudler No, I wasn't using that chart. I'm using the common app-template that the Kubernetes@Home community has adopted as a standard best practice; it provides a lot of benefits for free, such as extensible volume mounts. Sebastien's pointer to the readiness probes was the key for the build issue; however, I'm now hitting a segfault.

I'll start a new issue for that.

lenaxia avatar May 24 '23 21:05 lenaxia

For anyone coming across this later, here are two functioning deployments for LocalAI as of 2023.05.24. The first is a Helm release that uses Kubernetes@Home's app-template as a common base to create an all-in-one release. The second is a direct Pod deployment that can be adapted to whatever deployment method you prefer.

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: &appname localai
  namespace: home
spec:
  interval: 20m
  chart:
    spec:
      chart: app-template
      version: 1.5.0
      interval: 5m
      sourceRef:
        kind: HelmRepository
        name: bjw-s-charts
        namespace: flux-system
  # See https://github.com/bjw-s/helm-charts/blob/main/charts/library/common/values.yaml
  values:

    image: 
      repository: quay.io/go-skynet/local-ai
      tag: master

    env:
    - name: THREADS
      value: "14"
    - name: CONTEXT_SIZE
      value: "1024"
    - name: MODELS_PATH
      value: "/models"
    - name: IMAGE_PATH
      value: /tmp
    - name: BUILD_TYPE
      value: openblas
    - name: GO_TAGS
      value: stablediffusion
    - name: DEBUG
      value: "true"

    initContainers:
      download-model:
        image: busybox@sha256:b5d6fe0712636ceb7430189de28819e195e8966372edfc2d9409d79402a0dc16
        command: ["/bin/sh", "-c"]
        args:
          - |
            ## A simpler and more secure way if you have a way of staging an archive with the files you need
            #wget "https://s3.${SECRET_DEV_DOMAIN}/public/stablediffusion.tar" -P /tmp
            #tar -xzvf /tmp/stablediffusion.tar -C $MODELS_PATH
            #rm -rf /tmp/stablediffusion.tar

            ## A more general, less secure way that grabs all the files from GitHub
            ## Details here: https://github.com/go-skynet/LocalAI
            ## And here: https://github.com/lenaxia/stablediffusion-bins/releases/tag/2023.05.24
            mkdir -p $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-256-256-fp16-opt.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-512-512-fp16-opt.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-base-fp16.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/FrozenCLIPEmbedder-fp16.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-256-256-MHA-fp16-opt.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-512-512-MHA-fp16-opt.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-base-MHA-fp16.param" -P $MODELS_PATH/stablediffusion_assets
            wget "https://github.com/EdVince/Stable-Diffusion-NCNN/raw/main/x86/linux/assets/log_sigmas.bin" -P $MODELS_PATH/stablediffusion_assets
            wget "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/vocab.txt" -P $MODELS_PATH/stablediffusion_assets
            wget "https://github.com/lenaxia/stablediffusion-bins/releases/download/2023.05.24/UNetModel-MHA-fp16.bin" -P $MODELS_PATH/stablediffusion_assets
            wget "https://github.com/lenaxia/stablediffusion-bins/releases/download/2023.05.24/FrozenCLIPEmbedder-fp16.bin" -P $MODELS_PATH/stablediffusion_assets
            wget "https://github.com/lenaxia/stablediffusion-bins/releases/download/2023.05.24/AutoencoderKL-fp16.bin" -P $MODELS_PATH/stablediffusion_assets
            wget "https://github.com/lenaxia/stablediffusion-bins/releases/download/2023.05.24/AutoencoderKL-encoder-512-512-fp16.bin" -P $MODELS_PATH/stablediffusion_assets
        
            # Overwrite (not append) so re-runs don't duplicate the config
            cat << "EOF" > $MODELS_PATH/stablediffusion.yaml
            name: stablediffusion
            backend: stablediffusion
            asset_dir: stablediffusion_assets
            EOF

        env:
          - name: URL
            value: "https://gpt4all.io/models/ggml-gpt4all-j.bin"
          - name: MODELS_PATH
            value: "/models"
        volumeMounts:
          - name: models 
            mountPath: /models
        securityContext:
          runAsUser: 0

    persistence:
      models:
        enabled: true
        storageClass: local-path
        size: 30Gi
        type: pvc
        accessMode: ReadWriteOnce

    service:
      main:
        type: LoadBalancer
        ports:
          http:
            port: &port 8080

    ingress:
      main:
        enabled: true
        annotations:
          hajimari.io/enable: "true"
          hajimari.io/icon: eos-icons:ai
          hajimari.io/info: Local AI
          hajimari.io/group: home
          cert-manager.io/cluster-issuer: "letsencrypt-production"
          traefik.ingress.kubernetes.io/router.entrypoints: "websecure"
          traefik.ingress.kubernetes.io/router.middlewares: networking-chain-authelia@kubernetescrd
        hosts:
        - host: &uri ai.${SECRET_DEV_DOMAIN}
          paths:
          - path: /
            pathType: Prefix
        tls:
        - hosts:
            - *uri
          secretName: *uri
    
    nodeSelector:
      node-role.kubernetes.io/worker: "true"

    probes:
      liveness: 
        enabled: false
        custom: true
        spec:
          httpGet:
            path: /healthz
            port: *port
          initialDelaySeconds: 0
          periodSeconds: 30 
          timeoutSeconds: 1
          failureThreshold: 3
      readiness: 
        enabled: false
        custom: true
        spec:
          httpGet:
            path: /readyz
            port: *port
          initialDelaySeconds: 0
          periodSeconds: 30 
          timeoutSeconds: 1
          failureThreshold: 3
      startup:
        enabled: false

Method 2:

apiVersion: v1
kind: Pod
metadata:
  labels:
    app.kubernetes.io/instance: local-ai
    app.kubernetes.io/name: local-ai
  name: local-ai
  namespace: default
spec:
  containers:
  - env:
    - name: IMAGE_PATH
      value: /tmp
    - name: BUILD_TYPE
      value: openblas
    - name: GO_TAGS
      value: stablediffusion
    - name: DEBUG
      value: "true"
    - name: THREADS
      value: "8"
    - name: CONTEXT_SIZE
      value: "1024"
    - name: MODELS_PATH
      value: /models
    image: quay.io/go-skynet/local-ai:master
    imagePullPolicy: Always
    name: local-ai
    volumeMounts:
    - mountPath: /models
      name: models
  initContainers:
  - command:
    - /bin/sh
    - -c
    args:
      - |
        wget "https://s3.thekao.cloud/public/stablediffusion.tar" -P /tmp
        tar -xzvf /tmp/stablediffusion.tar -C /models
        rm -rf /tmp/stablediffusion.tar
    image: busybox
    imagePullPolicy: Always
    name: download-model
    volumeMounts:
    - mountPath: /models
      name: models
  restartPolicy: Always
  volumes:
  - name: models
    persistentVolumeClaim:
      claimName: local-ai-data

lenaxia avatar May 25 '23 07:05 lenaxia