Skaffold runs out of memory when there are many configurations and an artifact conflict

Open dmavrommatis opened this issue 1 year ago • 0 comments

I have a monorepo with a central skaffold.yaml file that pulls in all the other skaffold.yaml files via requires. At some point, someone added an artifact that duplicates another one. Normally, Skaffold reports an error like this:

source: /tmp/skaffold/skaffold.yaml, in module "other": source: /tmp/skaffold/other2.yaml, in module "postgresql2": source: /tmp/skaffold/other.yaml, in module "postgresql": source: /tmp/skaffold/skaffold.yaml, in module "traefik": duplicate image "postgresql-ci" found in sources /tmp/skaffold/other.yaml and /tmp/skaffold/other2.yaml: artifact image names must be unique across all configurations
source: /tmp/skaffold/skaffold.yaml, in module "other" on line 9 column 14: source: /tmp/skaffold/other2.yaml, in module "postgresql2" on line 9 column 14: source: /tmp/skaffold/other.yaml, in module "postgresql" on line 9 column 14: source: /tmp/skaffold/skaffold.yaml, in module "traefik" on line 9 column 14: duplicate image "postgresql-ci" found in sources /tmp/skaffold/other.yaml and /tmp/skaffold/other2.yaml: artifact image names must be unique across all configurations

In my case, trying to deploy the traefik service with skaffold deploy -m traefik hangs at "Helm release traefik not installed. Installing..." and then kills my laptop as it runs out of memory. Scaling the configuration down so that it requires only two other configs makes Skaffold print the error instead.

For the example I provided, this message appears 15 times instead of just once; adding 2 more requires brings it to 134 times, adding another 2 brings it to 518, and so on. My real repository has 47 configurations, so memory use appears to explode. I think having multiple configurations somehow introduces a recursive loop in this check that can kill the host.
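To make the suspected mechanism concrete, here is a minimal sketch of how a check that re-walks shared requires without remembering what it has already visited can grow exponentially. This is a hypothetical model, not Skaffold's actual code: the graph shape, the function names, and the assumption that the check is a plain recursive walk are all mine.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[str, List[str]]


def layered_graph(layers: int) -> Graph:
    """Build a hypothetical requires graph: every config in one layer
    requires both configs in the next layer."""
    graph: Graph = {"root": ["a0", "b0"] if layers > 0 else []}
    for i in range(layers):
        nxt = [f"a{i + 1}", f"b{i + 1}"] if i + 1 < layers else []
        graph[f"a{i}"] = list(nxt)
        graph[f"b{i}"] = list(nxt)
    return graph


def count_naive(graph: Graph, node: str) -> int:
    """Count how often a check runs if every path to a config is re-walked."""
    return 1 + sum(count_naive(graph, dep) for dep in graph.get(node, []))


def count_memoized(graph: Graph, node: str, seen: Optional[Set[str]] = None) -> int:
    """Count how often a check runs if each config is visited only once."""
    if seen is None:
        seen = set()
    if node in seen:
        return 0
    seen.add(node)
    return 1 + sum(count_memoized(graph, dep, seen) for dep in graph.get(node, []))


if __name__ == "__main__":
    g = layered_graph(4)
    print(count_naive(g, "root"))     # 31 visits for 9 configs
    print(count_memoized(g, "root"))  # 9 visits, one per config
```

In this model the naive count roughly doubles with every extra layer, while the memoized count grows by two. If the real check behaves similarly, that would be consistent with the error message count blowing up as more requires are added, but I have not confirmed this against the Skaffold source.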

Expected behavior

Skaffold should not hang and run out of memory when someone introduces the same artifact twice in a multi-configuration environment.

Actual behavior

Skaffold hangs on the deploy step indefinitely and kills the host with OOM.

Information

  • Skaffold version: v2.13.2
  • Operating system: Fedora 41
  • Installed via: skaffold.dev
  • Contents of skaffold.yaml:

skaffold.yaml

apiVersion: skaffold/v4beta10
kind: Config
metadata:
  name: traefik
deploy:
  helm:
    releases:
      - name: traefik
        repo: https://traefik.github.io/charts
        remoteChart: traefik
        version: 30.0.0
---
apiVersion: skaffold/v4beta10
kind: Config
metadata:
  name: other
requires:
  - path: ./other.yaml
  - path: ./other2.yaml

other.yaml

apiVersion: skaffold/v4beta10
kind: Config
metadata:
  name: postgresql
build:
  tagPolicy:
    gitCommit: {}
  artifacts:
    - image: postgresql-ci
      docker:
        dockerfile: Dockerfile.postgres
  local:
    useBuildkit: true
deploy:
  helm:
    releases:
      - name: postgresql
        repo: https://charts.bitnami.com/bitnami
        remoteChart: postgresql
        version: 14.3.1
        setValueTemplates:
          image:
            registry: "{{.IMAGE_DOMAIN_postgresql_ci}}"
            repository: "{{.IMAGE_REPO_NO_DOMAIN_postgresql_ci}}"
            tag: "{{.IMAGE_TAG_postgresql_ci}}"

other2.yaml

apiVersion: skaffold/v4beta10
kind: Config
metadata:
  name: postgresql2
build:
  tagPolicy:
    gitCommit: {}
  artifacts:
    - image: postgresql-ci
      docker:
        dockerfile: Dockerfile.postgres
  local:
    useBuildkit: true
deploy:
  helm:
    releases:
      - name: postgresql
        repo: https://charts.bitnami.com/bitnami
        remoteChart: postgresql
        version: 14.3.1
        setValueTemplates:
          image:
            registry: "{{.IMAGE_DOMAIN_postgresql_ci}}"
            repository: "{{.IMAGE_REPO_NO_DOMAIN_postgresql_ci}}"
            tag: "{{.IMAGE_TAG_postgresql_ci}}"

Dockerfile.postgres

FROM bitnami/postgresql:15
USER 1001:0

Steps to reproduce the behavior

  1. skaffold deploy -m traefik

Then add more other.yaml files and rerun.
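As a stopgap, a duplicate artifact can be caught before Skaffold is invoked at all. The sketch below is a rough, line-based heuristic rather than a real YAML parser: it assumes artifacts are written as "- image: <name>" list items, as in the files above, and the function name find_duplicate_images is hypothetical. It only reports an image that appears in more than one file.

```python
import re
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List

# Matches list items such as "    - image: postgresql-ci".
# Plain "image:" keys (e.g. under setValueTemplates) are not list items
# and are intentionally ignored.
IMAGE_LINE = re.compile(r"^\s*-\s*image:\s*['\"]?([^'\"\s#]+)")


def find_duplicate_images(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Return {image name: sorted list of files} for images in 2+ files."""
    sources = defaultdict(set)
    for path in paths:
        for line in Path(path).read_text().splitlines():
            match = IMAGE_LINE.match(line)
            if match:
                sources[match.group(1)].add(str(path))
    return {
        image: sorted(files)
        for image, files in sources.items()
        if len(files) > 1
    }
```

Calling find_duplicate_images(["other.yaml", "other2.yaml"]) on the files from this report should flag postgresql-ci as present in both, which is the same conflict Skaffold reports when it does not hang.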

dmavrommatis, Oct 03 '24 16:10