
[Bug]: Docker compose is not working

Open Natgho opened this issue 1 year ago • 20 comments

crawl4ai version

latest

Expected Behavior

it should work?

Current Behavior

When I follow the document to the last step and try to run it with the local profile, I get an error.

document: https://github.com/unclecode/crawl4ai/blob/main/docs/deprecated/docker-deployment.md

Steps: I filled in the .env file, then ran the following command and got this response:

root@abc123:~/crawl4ai# VERSION=all docker compose --profile local-amd64 up -d
service "base-config" has neither an image nor a build context specified: invalid compose project

My docker version:

root@abc123:~/crawl4ai# docker version
Client: Docker Engine - Community
 Version:           27.0.3
 API version:       1.46
 Go version:        go1.21.11
 Git commit:        7d4bcd8
 Built:             Sat Jun 29 00:02:33 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.0.3
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.11
  Git commit:       662f78c
  Built:            Sat Jun 29 00:02:33 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.18
  GitCommit:        ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
root@abc123:~/crawl4ai# docker compose version
Docker Compose version v2.28.1

Is this reproducible?

Yes

Inputs Causing the Bug


Steps to Reproduce


Code snippets


OS

Linux

Python version

3.10

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

No response

Natgho avatar Jan 23 '25 15:01 Natgho

Seeking the help of Docker experts in root-causing this issue.

@FractalMind Could you lend us a hand here? 😁

aravindkarnam avatar Jan 31 '25 12:01 aravindkarnam

I am having the same issue using VERSION=all docker compose --profile local-arm64 up -d

80Builder80 avatar Feb 02 '25 04:02 80Builder80

@aravindkarnam @unclecode @Natgho

Deployed on a Raspberry Pi 5 using "docker-compose --profile local-arm64 up -d". The container was built, deployed, and shows Healthy. I have not tested it any further than this.

Courtesy of ChatGPT-01:

"Why it fails

In older versions of Docker Compose (particularly the v1 Python-based version), you could sometimes get away with “dummy” services that existed only to provide config via extends. The new (v2) Compose engine is more strict: if something is listed under services:, it must be an actual, runnable service with a valid image or build context."
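To make the failure concrete, the deprecated compose file effectively contained something like the fragment below (reconstructed from the error message and the explanation above, not copied from the repo), which Compose v2 rejects outright:

services:
  base-config:              # no image: and no build: -> "invalid compose project" under Compose v2
    ports:
      - "11235:11235"
    restart: unless-stopped

  crawl4ai-amd64:
    extends:
      service: base-config  # shared settings were pulled in via extends
    build: .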

"How to fix it

Use an x- config reference (preferred for "base" config). Instead of listing base-config under services:, move those shared settings into a named extension field. Then you can "merge" them into each actual service. For example:

docker-compose.yml:

version: '3.9'  # Or omit if you're using Docker Compose v2+ standalone

x-base-config: &base-config
  ports:
    - "11235:11235"
    - "8000:8000"
    - "9222:9222"
    - "8080:8080"
  environment:
    - CRAWL4AI_API_TOKEN=${CRAWL4AI_API_TOKEN:-}
    - OPENAI_API_KEY=${OPENAI_API_KEY:-}
    - CLAUDE_API_KEY=${CLAUDE_API_KEY:-}
  volumes:
    - /dev/shm:/dev/shm
  deploy:
    resources:
      limits:
        memory: 4G
      reservations:
        memory: 1G
  restart: unless-stopped
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:11235/health"]
    interval: 30s
    timeout: 10s
    retries: 3
    start_period: 40s

services:
  # Local build services for different platforms
  crawl4ai-amd64:
    <<: *base-config
    build:
      context: .
      dockerfile: Dockerfile
      args:
        PYTHON_VERSION: "3.10"
        INSTALL_TYPE: ${INSTALL_TYPE:-basic}
        ENABLE_GPU: "false"
      platforms:
        - linux/amd64
    profiles: ["local-amd64"]

  crawl4ai-arm64:
    <<: *base-config
    build:
      context: .
      dockerfile: Dockerfile
      args:
        PYTHON_VERSION: "3.10"
        INSTALL_TYPE: ${INSTALL_TYPE:-basic}
        ENABLE_GPU: "false"
      platforms:
        - linux/arm64
    profiles: ["local-arm64"]

  # Hub services for different platforms and versions
  crawl4ai-hub-amd64:
    <<: *base-config
    image: unclecode/crawl4ai:${VERSION:-basic}-amd64
    profiles: ["hub-amd64"]

  crawl4ai-hub-arm64:
    <<: *base-config
    image: unclecode/crawl4ai:${VERSION:-basic}-arm64
    profiles: ["hub-arm64"]

Notice:

  • base-config is no longer declared as a service.
  • It is defined under an extension key x-base-config.
  • Each actual service uses the YAML merge operator <<: *base-config to include those settings."
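As a quick sanity check (not from the original thread), the fully merged configuration can be inspected before starting anything, which shows whether the anchors resolved as intended:

VERSION=basic docker compose --profile local-arm64 config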

80Builder80 avatar Feb 02 '25 14:02 80Builder80

@unclecode If you want, I can set the project up on a Docker and docker-compose.yaml infrastructure. Is there any active work you are doing on this?

Natgho avatar Feb 02 '25 14:02 Natgho

Here's the formatted file. I can't attach a .yml file so you'll need to change the extension.

docker-compose.md

80Builder80 avatar Feb 02 '25 14:02 80Builder80

@Natgho @80Builder80 Appreciate the offer! I've developed a new Docker setup, and it's at the alpha stage, ready this week. I actually need help with thorough testing; if you're interested, I can share details and technical docs outlining the tasks. It would be a huge help, let me know!

unclecode avatar Feb 02 '25 15:02 unclecode


Oh, I love that! RPi! That’s exactly what I keep saying—“build and design with RPi in mind.” I’m planning to start a decentralized browser network using a lightweight Chromium build that I’ve already compiled from scratch. Plus, an RPi grid is pure fun! 🤩 If you’re into bits and bytes, join me and help out!

unclecode avatar Feb 02 '25 15:02 unclecode

@unclecode For some reason Chromium wasn't installed, so I rebuilt the container after updating the Dockerfile with:

# Install base requirements
RUN pip install --no-cache-dir -r requirements.txt \
    && playwright install chromium
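For anyone reproducing this, a rebuild after such a Dockerfile edit typically looks like the following (profile name taken from the commands earlier in this thread):

docker compose --profile local-arm64 build --no-cache
docker compose --profile local-arm64 up -d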

Once that was installed it seems to be working well on the RPi 5. I've tested several of the Crawl4AI functions and haven't had any issues so far.

I will probably move it over to my Orange Pi 5+ simply because it has more RAM and compute. I hope there are no issues on the OPi5, because the OS support is not good.

80Builder80 avatar Feb 03 '25 00:02 80Builder80

@80Builder80 Which Dockerfile version are you working with? The one from the "main" branch, or the new one in "next" that I committed yesterday?

unclecode avatar Feb 03 '25 00:02 unclecode

@unclecode I was using the "main" branch. I'll try out the "next" version a little later.

80Builder80 avatar Feb 03 '25 00:02 80Builder80

@unclecode I will be happy to help, of course :) I checked the Dockerfile and docker-compose.yaml file in the next branch. I tested it on macOS (15.3) and the result is still the same.

Natgho avatar Feb 04 '25 20:02 Natgho

/crawl4ai (next) [1]> VERSION=basic docker-compose --profile local-arm64 up
[+] Running 0/1
 ⠙ Service crawl4ai-arm64  Building                                                                                                                               0.1s
[+] Building 2.0s (21/22)                                                                                                                         docker:desktop-linux 
 => [crawl4ai-arm64 internal] load build definition from Dockerfile                                                                                               0.0s
 => => transferring dockerfile: 4.90kB                                                                                                                            0.0s
 => [crawl4ai-arm64 internal] load metadata for docker.io/library/python:3.10-slim                                                                                1.9s
 => [crawl4ai-arm64 auth] library/python:pull token for registry-1.docker.io                                                                                      0.0s 
 => [crawl4ai-arm64 internal] load .dockerignore                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                   0.0s
 => CANCELED [crawl4ai-arm64  1/17] FROM docker.io/library/python:3.10-slim@sha256:a03d346e897f21b4cf5cfc21663f3fc56394f07ce29c44043688e5bee786865b               0.0s
 => => resolve docker.io/library/python:3.10-slim@sha256:a03d346e897f21b4cf5cfc21663f3fc56394f07ce29c44043688e5bee786865b                                         0.0s
 => => sha256:18a5f461ace54a9f99dda57ed9adc168f05b71c10cddfbdb9068f6fee407011c 1.75kB / 1.75kB                                                                    0.0s
 => => sha256:48ceb3b1c775b70d47b0af390ac48f9ebc0d9163a877f30890030de3077eaa08 5.31kB / 5.31kB                                                                    0.0s
 => => sha256:a03d346e897f21b4cf5cfc21663f3fc56394f07ce29c44043688e5bee786865b 9.13kB / 9.13kB                                                                    0.0s
 => [crawl4ai-arm64 internal] load build context                                                                                                                  0.0s
 => => transferring context: 30.99kB                                                                                                                              0.0s
 => CACHED [crawl4ai-arm64  2/17] RUN apt-get update && apt-get install -y --no-install-recommends     build-essential     curl     wget     gnupg     git     c  0.0s
 => CACHED [crawl4ai-arm64  3/17] RUN apt-get update && apt-get install -y --no-install-recommends     libglib2.0-0     libnss3     libnspr4     libatk1.0-0      0.0s
 => CACHED [crawl4ai-arm64  4/17] RUN if [ "false" = "true" ] && [ "arm64" = "amd64" ] ; then     apt-get update && apt-get install -y --no-install-recommends    0.0s
 => CACHED [crawl4ai-arm64  5/17] RUN if [ "arm64" = "arm64" ]; then     echo "🦾 Installing ARM-specific optimizations";     apt-get update && apt-get instal     0.0s
 => CACHED [crawl4ai-arm64  6/17] WORKDIR /app                                                                                                                    0.0s
 => CACHED [crawl4ai-arm64  7/17] RUN echo '#!/bin/bash\nif [ "$USE_LOCAL" = "true" ]; then\n    echo "📦 Installing from local source..."\n    pip install --     0.0s
 => CACHED [crawl4ai-arm64  8/17] COPY . /tmp/project/                                                                                                            0.0s
 => CACHED [crawl4ai-arm64  9/17] COPY deploy/docker/requirements.txt .                                                                                           0.0s
 => CACHED [crawl4ai-arm64 10/17] RUN pip install --no-cache-dir -r requirements.txt                                                                              0.0s
 => CACHED [crawl4ai-arm64 11/17] RUN if [ "basic" = "all" ] ; then         pip install --no-cache-dir             torch             torchvision             tor  0.0s
 => CACHED [crawl4ai-arm64 12/17] RUN if [ "basic" = "all" ] ; then         pip install "/tmp/project/[all]" &&         python -m crawl4ai.model_loader ;     el  0.0s
 => CACHED [crawl4ai-arm64 13/17] RUN pip install --no-cache-dir --upgrade pip &&     /tmp/install.sh &&     python -c "import crawl4ai; print('✅ crawl4ai is     0.0s
 => CACHED [crawl4ai-arm64 14/17] RUN playwright install --with-deps chromium                                                                                     0.0s
 => CACHED [crawl4ai-arm64 15/17] COPY deploy/docker/* /app/                                                                                                      0.0s
 => ERROR [crawl4ai-arm64 16/17] COPY deploy/docker/docker-entrypoint.sh /usr/local/bin/                                                                          0.0s
------
 > [crawl4ai-arm64 16/17] COPY deploy/docker/docker-entrypoint.sh /usr/local/bin/:
------
failed to solve: failed to compute cache key: failed to calculate checksum of ref cb64612c-3e1f-4187-8cc9-8bb4f244574b::vb0hvg2yvgstq5iaz8slr1i8l: "/deploy/docker/docker-entrypoint.sh": not found

I edited the docker-compose.yaml file as it should be. While the base-config approach is good, the "x-" prefix should be used for such sub-parts; otherwise docker-compose throws an error, because it considers that part a service.

Here is the edited version: https://github.com/unclecode/crawl4ai/pull/618/

Natgho avatar Feb 04 '25 21:02 Natgho

@Natgho Beautiful 🤩, appreciate it. Later today I'll check and accept it. You really are a DockerMan :)), aren't you?

unclecode avatar Feb 04 '25 22:02 unclecode

I can't remember the last time I put effort into installing an application :laughing: Docker is an amazing invention, and I'm grateful to the creator of each and every line of code 🙏

This PR solves most of the problem, but an important piece of the puzzle is missing: "/deploy/docker/docker-entrypoint.sh". This line of the Dockerfile copies docker-entrypoint.sh: https://github.com/unclecode/crawl4ai/blob/c308a794e838c93f6861307f63cfac878be5665b/Dockerfile#L151

But this file is not present at "deploy/docker/docker-entrypoint.sh". Could you have forgotten to add it? I didn't want to write a .sh file from scratch.
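For anyone blocked on the same COPY step, a generic pass-through entrypoint (a placeholder, not the project's actual missing file) is usually enough to let the build finish, since it just hands control to whatever CMD the image defines:

#!/bin/bash
# Placeholder entrypoint: fail fast on errors, then run the container's CMD.
set -e
exec "$@"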

Natgho avatar Feb 04 '25 22:02 Natgho

@Natgho your PR is merged, thx

unclecode avatar Feb 09 '25 05:02 unclecode

On my setup, with the default compose/Dockerfile, it was saying that Playwright needs to be installed.

I made a simplified Dockerfile that works with Chromium only, for the "basic" build mode:

  • uses Python 3.13
  • installs Playwright
  • mkdocs might not be useful
  • some dependencies are not needed (at least in basic build mode) -> it reduces the image size and build time

https://github.com/loorisr/crawl4ai/blob/main/Dockerfile
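For illustration only, a Chromium-only "basic" image along those lines might look roughly like the sketch below; this is not loorisr's actual Dockerfile (see the link above for that), and the package list is an assumption:

FROM python:3.13-slim

WORKDIR /app

# Library plus Playwright, then the Playwright-managed Chromium build with its OS deps
RUN pip install --no-cache-dir crawl4ai playwright \
    && playwright install --with-deps chromium

# Sanity check that the library imports inside the image
RUN python -c "import crawl4ai; print('crawl4ai OK')"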

loorisr avatar Feb 20 '25 10:02 loorisr

Why not just make a compose file that uses the specific image for your hardware? Then you can just run docker compose up -d instead of the clunky profile commands.

services:
  # Crawl4AI - Web Scraping and Data Extraction Service
  crawl4ai:
    image: unclecode/crawl4ai:basic-amd64
    container_name: crawl4ai
    security_opt:
      - no-new-privileges:true
    restart: unless-stopped
    profiles: ["apps", "all"]
    networks:
      - default
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:11235/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    ports:
      - "${CRAWL4AI_PORT:-11235}:11235"
    volumes:
      - /dev/shm:/dev/shm
      - ${DOCKERDIR}/appdata/crawl4ai:/data
    labels:
      - homepage.group=AI Tools
      - homepage.name=Crawl4AI
      - homepage.icon=globe
      - homepage.href=https://crawl4ai.${DOMAINNAME_1}
      - homepage.description=Web Scraping and Data Extraction
      - homepage.weight=52
      - "traefik.enable=true"
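Worth noting (an observation, not from the original comment): because the service above still declares profiles:, it only starts when one of those profiles is enabled. Dropping the profiles: line makes a plain docker compose up -d work; otherwise it needs:

docker compose --profile apps up -d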

RepairYourTech avatar Mar 01 '25 19:03 RepairYourTech

The image unclecode/crawl4ai:basic-amd64 works out of the box, but it is 3 months old.

When we build from the main GitHub repo using docker compose, but with

    build: 
      context: https://github.com/unclecode/crawl4ai.git
      args:
        PYTHON_VERSION: "3.10"
        INSTALL_TYPE: basic
        ENABLE_GPU: false
      platforms:
          - linux/amd64

the image builds, but at the first crawl request I get the error:


Playwright is not installed
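For context, a complete service entry around that build fragment might look like the sketch below (the service name, port mapping, and shm size are assumptions, not taken from the repo). The "Playwright is not installed" error itself matches what 80Builder80 hit earlier in this thread, where adding playwright install chromium to the image build was the workaround:

services:
  crawl4ai:
    build:
      context: https://github.com/unclecode/crawl4ai.git
      args:
        PYTHON_VERSION: "3.10"
        INSTALL_TYPE: basic
        ENABLE_GPU: false
      platforms:
        - linux/amd64
    ports:
      - "11235:11235"
    shm_size: 1g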

loorisr avatar Mar 02 '25 14:03 loorisr

We need CI pipeline support here: with each new version released, a GitHub Actions workflow should automatically build and push the current image to Docker Hub.

@unclecode Do you need help with that?

This document explains exactly what we need: https://github.com/marketplace/actions/build-and-push-docker-images
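Not part of the thread, but a minimal workflow along the lines of that action's documentation could look like the sketch below; the tag trigger, secret names, and image tag scheme are all assumptions:

name: publish-docker

on:
  push:
    tags:
      - "v*"

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # QEMU + Buildx so amd64 and arm64 images can be built in one job
      - uses: docker/setup-qemu-action@v3
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}  # assumed secret name
          password: ${{ secrets.DOCKERHUB_TOKEN }}     # assumed secret name
      - uses: docker/build-push-action@v6
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: unclecode/crawl4ai:${{ github.ref_name }}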

Natgho avatar Mar 03 '25 19:03 Natgho

We need CI pipeline support here: with each new version released, a GitHub Actions workflow should automatically build and push the current image to Docker Hub.

@unclecode Do you need help with that?

This document explains exactly what we need: https://github.com/marketplace/actions/build-and-push-docker-images

I was looking at that; I can probably get AI to do it all.

RepairYourTech avatar Mar 04 '25 11:03 RepairYourTech

Hi

Is help needed here still? I am looking to help with docker related issues in projects

Cheers

depach avatar Apr 17 '25 09:04 depach

@depach Hey! So we have a brand new Docker setup now that doesn't have this problem. This relates to an older version, so I'll go ahead and close this issue.

Is help needed here still? I am looking to help with docker related issues in projects

For sure, we actually need all the help we can get. You can check out the backlog and try to root-cause any Docker-related issues. Also pick up any root-caused issues, attempt a fix, and raise a PR. Looking forward to contributions from you.

aravindkarnam avatar May 07 '25 06:05 aravindkarnam