Flux-dev docker image error
Describe the bug
PS F:\cog-flux-dev> docker images
REPOSITORY        TAG      IMAGE ID       CREATED          SIZE
flux-dev-model    latest   c19d0ffc3660   11 minutes ago   55.1GB
After building the Docker image, I ran the flux-dev image.

Run command:

docker run --gpus all -it flux-dev-model

The error I get:
RuntimeError: Failed to import diffusers.pipelines.flux.pipeline_flux because of the following error (look up to see its traceback):
Failed to import diffusers.loaders.single_file because of the following error (look up to see its traceback):
Failed to import transformers.models.auto.image_processing_auto because of the following error (look up to see its traceback):
partially initialized module 'torch._dynamo' has no attribute 'external_utils' (most likely due to a circular import)
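The three "Failed to import …" messages are from lazy-import wrappers; the real failure is the torch._dynamo line at the bottom. A small diagnostic sketch (my own, not part of the original report) that can be run inside the container to import each module in the chain directly, so Python prints the underlying traceback instead of the truncated "look up to see its traceback" message:

```python
import importlib
import traceback

# Import each module in the failing chain directly; a direct import
# surfaces the real traceback instead of the lazy-import summary.
CHAIN = [
    "torch._dynamo",
    "transformers.models.auto.image_processing_auto",
    "diffusers.loaders.single_file",
    "diffusers.pipelines.flux.pipeline_flux",
]

results = {}
for name in CHAIN:
    try:
        importlib.import_module(name)
        results[name] = "ok"
    except Exception:
        # Keep the full traceback so the root cause is visible.
        results[name] = traceback.format_exc()

for name, status in results.items():
    print(f"{name}: {status.splitlines()[-1]}")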
Reproduction
import os
import time
import torch
from dotenv import load_dotenv
import numpy as np
from PIL import Image
import runpod
import boto3
from typing import List
from pathlib import Path
from diffusers import (
    FluxPipeline,
    FluxImg2ImgPipeline
)
from torchvision import transforms
from transformers import CLIPImageProcessor
from diffusers.pipelines.stable_diffusion.safety_checker import (
    StableDiffusionSafetyChecker
)
from transformers.utils.hub import move_cache

move_cache()
load_dotenv()

S3_BUCKET_NAME = os.getenv('S3_BUCKET_NAME')
AWS_ACCESS_KEY_ID = os.getenv('AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = os.getenv('AWS_SECRET_ACCESS_KEY')

MAX_IMAGE_SIZE = 1440
MODEL_CACHE = "FLUX.1-dev"
SAFETY_CACHE = "safety-1.0"
FEATURE_EXTRACTOR = "feature-extractor"
#SAFETY_URL = "https://weights.replicate.delivery/default/sdxl/safety-1.0.tar"
#MODEL_URL = "https://weights.replicate.delivery/default/black-forest-labs/FLUX.1-dev/files.tar"

ASPECT_RATIOS = {
    "1:1": (1024, 1024),
    "16:9": (1344, 768),
    "21:9": (1536, 640),
    "3:2": (1216, 832),
    "2:3": (832, 1216),
    "4:5": (896, 1088),
    "5:4": (1088, 896),
    "3:4": (896, 1152),
    "4:3": (1152, 896),
    "9:16": (768, 1344),
    "9:21": (640, 1536),
}

class Predictor:
    def setup(self) -> None:
        """Load the model into memory to make running multiple predictions efficient"""
        start = time.time()

        print("Loading safety checker...")
        if not os.path.exists(SAFETY_CACHE):
            download_weights(SAFETY_URL, SAFETY_CACHE)
        self.safety_checker = StableDiffusionSafetyChecker.from_pretrained(
            SAFETY_CACHE, torch_dtype=torch.float16
        ).to("cuda")
        self.feature_extractor = CLIPImageProcessor.from_pretrained(FEATURE_EXTRACTOR)

        print("Loading Flux txt2img Pipeline")
        if not os.path.exists(MODEL_CACHE):
            download_weights(MODEL_URL, '.')
        self.txt2img_pipe = FluxPipeline.from_pretrained(
            MODEL_CACHE,
            torch_dtype=torch.bfloat16,
            cache_dir=MODEL_CACHE
        ).to("cuda")

        print("Loading Flux img2img pipeline")
        self.img2img_pipe = FluxImg2ImgPipeline(
            transformer=self.txt2img_pipe.transformer,
            scheduler=self.txt2img_pipe.scheduler,
            vae=self.txt2img_pipe.vae,
            text_encoder=self.txt2img_pipe.text_encoder,
            text_encoder_2=self.txt2img_pipe.text_encoder_2,
            tokenizer=self.txt2img_pipe.tokenizer,
            tokenizer_2=self.txt2img_pipe.tokenizer_2,
        ).to("cuda")

        print("setup took: ", time.time() - start)
My Dockerfile:
# Base image
FROM runpod/base:0.4.0-cuda11.8.0

# Upgrade pip
RUN python3.10 -m pip install --upgrade pip

# Copy requirements.txt and install Python dependencies
COPY builder/requirements.txt /requirements.txt
RUN python3.10 -m pip install --ignore-installed --upgrade -r /requirements.txt --no-cache-dir && \
    rm /requirements.txt

# Install Hugging Face diffusers from a specific commit
RUN python3.10 -m pip install git+https://github.com/huggingface/diffusers.git@249a9e48e8f8aac4356d5a285c8ba0c600a80f64 --no-cache-dir

RUN python3.10 -m pip install --ignore-installed --upgrade \
    torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 \
    --index-url https://download.pytorch.org/whl/cu118 --no-cache-dir

# Optional: Create a symlink for python3.10 as python
RUN ln -s /usr/bin/python3.10 /usr/bin/python

# Copy source files into the container
COPY src /app
WORKDIR /app

# Run the main script
CMD ["python3.10", "-u", "/app/handler.py"]
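One thing worth noting about this Dockerfile: requirements.txt (and the diffusers commit, which pulls in torch as a dependency) is installed first, and torch 2.4.1+cu118 is then force-reinstalled with --ignore-installed. That sequence can leave two partially overlapping torch installs in site-packages, which is a plausible cause of a "partially initialized module" error like the one above. A minimal startup guard (a hypothetical helper, not part of the original handler; the pinned version string "2.4.1+cu118" is my assumption based on this Dockerfile) that fails fast if the torch build inside the container is broken or not the expected one:

```python
def torch_matches(expected: str) -> bool:
    """Return True only if torch imports cleanly AND reports `expected`.

    A broken or partial install (e.g. the circular-import error in this
    issue) also counts as a mismatch, because the import itself raises.
    """
    try:
        import torch
        return torch.__version__ == expected
    except Exception:
        return False
```

Calling something like `torch_matches("2.4.1+cu118")` as the first step of handler.py turns a confusing lazy-import failure into an immediate, explicit error. Alternatively, pinning torch inside requirements.txt and installing everything in a single pip invocation avoids the double install entirely.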
Logs
No response
System Info
root@60cbc6883d98:/app# diffusers-cli env
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
0it [00:00, ?it/s]
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.
- 🤗 Diffusers version: 0.31.0.dev0
- Platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
- Running on Google Colab?: No
- Python version: 3.10.12
- PyTorch version (GPU?): 2.4.1+cu118 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.26.0
- Transformers version: 4.45.2
- Accelerate version: 1.0.1
- PEFT version: not installed
- Bitsandbytes version: not installed
- Safetensors version: 0.4.5
- xFormers version: not installed
- Accelerator: NVIDIA GeForce RTX 4090, 24564 MiB
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
Who can help?
@sayakpaul @DN6
Hi, you posted these errors:
Failed to import diffusers.pipelines.flux.pipeline_flux because of the following error (look up to see its traceback):
Failed to import diffusers.loaders.single_file because of the following error (look up to see its traceback):
Failed to import transformers.models.auto.image_processing_auto because of the following error (look up to see its traceback)
but these lines don't provide any information beyond "look up to see its traceback". You need to search for the full tracebacks in the container logs and post them; with only what you provided, we can't know what's wrong.
It also seems you're trying to load Flux.dev on a 4090 using a container on Windows. Unless the 4090 is a secondary GPU (or running without a GUI), you won't be able to load the model; it won't fit, and even with all its VRAM free it may still not work. I have a 3090 that can load it, but I tried other 24 GB VRAM GPUs and they still OOM, so I guess it also depends on the vendor.
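On the VRAM concern raised above: if moving the whole pipeline to the GPU with `.to("cuda")` doesn't fit in 24 GB, diffusers' model CPU offloading is a common workaround. A sketch (assuming diffusers >= 0.30 with FluxPipeline and `accelerate` installed, which `enable_model_cpu_offload` requires; `model_path` is whatever local cache or hub ID you load from):

```python
def load_flux_offloaded(model_path: str):
    """Load FluxPipeline without .to("cuda"), offloading idle submodules.

    Sketch only: enable_model_cpu_offload keeps just the currently
    executing submodule (text encoders, transformer, VAE) on the GPU
    and parks the rest in CPU RAM, trading speed for peak VRAM.
    """
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(model_path, torch_dtype=torch.bfloat16)
    # Replaces pipe.to("cuda"); accelerate moves modules on demand.
    pipe.enable_model_cpu_offload()
    return pipe
```

Note this would not fix the import error in this issue (the pipeline never gets far enough to allocate VRAM), but it is relevant once the container imports cleanly.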
I have run this locally on my RTX 4090 (24 GB VRAM) and it works fine. The error above occurs only when I convert it into a Docker image to deploy on serverless cloud. As I clearly mentioned, the issue happens only inside the Docker image. @asomoza
I don't think debugging issues within a Docker image is within the scope of this repository. I see many moving parts that might not be under our control.
How to resolve this one ? @sayakpaul
We cannot be expected to look into Docker-related problems. If you can reproduce the error without using Docker, please let us know.
Marking this as closed due to inactivity, and because the issue is not specific to Diffusers. It's an import error, probably caused by an incompatible torch version. As Sayak stated, debugging Docker-related issues and finding what's wrong in the config is not something we can look into.