
[BUG] Third container stuck on starting and HNS hangs on Windows Server 2022

Open mammadalius opened this issue 2 years ago • 9 comments

Description

Previously I had a Windows Server 2019 and I could start hundreds of containers without any issue. Now I'm trying to renew my environment with a Windows Server 2022 Datacenter.

I have three copies of docker-compose.yml. Each exposes a port as follows:

version: '3.8'
services:
  web:
    image: "mcr.microsoft.com/windows/nanoserver:ltsc2022"
    ports:
      - "4604:80"
    command: "ping 4.2.2.4 -t"

The exposed ports are different on each yml so there is no conflict.

The first container starts and the second container starts, but when starting the THIRD container, the Host Network Service (HNS) shows up in Task Manager using 20-30% CPU, the container is stuck on starting, and nothing happens after that.

I tried several scenarios to find the simplest way to reproduce this error. The key point is exposing ports on containers. Without exposing ports I can start multiple containers.
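For anyone trying to reproduce this, here is a minimal sketch that writes three such projects to disk. Only host port 4604 appears in the report above; the other two ports, the directory names (repro1, repro2, ...) and the helper names are placeholders chosen for illustration.

```python
from pathlib import Path

# Compose file from the report; only the published host port varies.
TEMPLATE = """\
version: '3.8'
services:
  web:
    image: "mcr.microsoft.com/windows/nanoserver:ltsc2022"
    ports:
      - "{host_port}:80"
    command: "ping 4.2.2.4 -t"
"""


def render_compose(host_port: int) -> str:
    """Return the compose file text for one project."""
    return TEMPLATE.format(host_port=host_port)


def write_projects(base_dir: Path, host_ports: list[int]) -> list[Path]:
    """Write one docker-compose.yml per port into its own directory.

    The directory names are hypothetical; Compose derives the project name
    (and therefore the default network name) from the directory.
    """
    written = []
    for index, port in enumerate(host_ports, start=1):
        project_dir = base_dir / f"repro{index}"
        project_dir.mkdir(parents=True, exist_ok=True)
        compose_file = project_dir / "docker-compose.yml"
        compose_file.write_text(render_compose(port), encoding="utf-8")
        written.append(compose_file)
    return written
```

Calling write_projects(Path("."), [4604, 4605, 4606]) and then running docker compose up -d in each directory in turn should show whether the third project hangs on a given host.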

Host Network Service (HNS) uses CPU.

What is the difference between networking defaults in Windows Server 2019 and 2022?

Should I start some windows services on 2022?

Steps To Reproduce

No response

Compose Version

No response

Docker Environment

No response

Anything else?

No response

mammadalius avatar Mar 19 '23 22:03 mammadalius

To confirm, do you only see this behavior when using Compose?

If you launch containers via docker run -p 1234:5678 ..., do you see similar hangs?

milas avatar Mar 23 '23 14:03 milas

Hey!

I've got a very similar problem on Windows 11 22H2, but it does not seem to be related to 2.17.x: I see the same issue with 2.15.1 and 2.16.0. This was working a couple of weeks ago. I've also tested with dockerd 20.10.23 and 23.0.1.

Short Problem Description

I cannot run two containers in two docker-compose projects with container-local port 80 on the same machine. Each project has its own standard nat network.

But it's only with Port 80!

setup

basic setup:

version: "3.9"
services:
  web:
    image:  mcr.microsoft.com/dotnet/framework/aspnet:4.8
    ports:
      - "10001:80"
    depends_on:
      - mongo
  mongo:
    image: "mongo:5.0-nanoserver"
    ports:
      - "20001:27017"

Works without any problems.

docker ps shows

  0.0.0.0:10001->80/tcp       test-web-1
  0.0.0.0:20001->27017/tcp   test-mongo-1

If I start another docker-compose file with both ports incremented by 1, the mongo container with 20002 comes up, BUT 0.0.0.0:10002->80/tcp does not work; instead docker-compose blocks. The container also cannot be removed until dockerd is restarted.

docker ps shows these port mappings:

  0.0.0.0:10001->80/tcp       test-web-1
  0.0.0.0:20001->27017/tcp   test-mongo-1
  0.0.0.0:20002->27017/tcp   test-mongo-2

Deeper analysis:

While the first docker-compose project is running with ports 10001 and 20001 forwarded and working, I can start another container with 10002->80 without any problems:

docker run --rm -p 0.0.0.0:10002:80 mcr.microsoft.com/dotnet/framework/aspnet:4.8

That's the same redirection, but it's not working with docker compose.

Dockerd in foreground shows

time="2023-03-27T20:31:30.341394500+02:00" level=debug msg="Assigning addresses for endpoint compassionate_solomon's interface on network nat"
time="2023-03-27T20:31:30.341394500+02:00" level=debug msg="RequestAddress(172.19.192.0/20, <nil>, map[])"
time="2023-03-27T20:31:30.342119900+02:00" level=debug msg="endpointStruct.EnableInternalDNS =[false]"
time="2023-03-27T20:31:30.342119900+02:00" level=debug msg="[POST]=>[/endpoints/] Request : {\"VirtualNetwork\":\"8C485388-5CB8-45EC-AAF8-1C5F66AE7901\",\"Policies\":[{\"Type\":\"NAT\",\"Protocol\":\"tcp\",\"InternalPort\":80,\"ExternalPort\":10002,\"ExternalPortReserved\":true}],\"EnableInternalDNS\":true}"
time="2023-03-27T20:31:30.362006500+02:00" level=debug msg="Network Response :...

If I start the container with docker-compose it looks similar, but here 0.0.0.0/0 gets requested and no response follows.

time="2023-03-27T19:36:42.919141100+02:00" level=debug msg="Assigning addresses for endpoint test2-web-1's interface on network test2_default"
time="2023-03-27T19:36:42.919705900+02:00" level=debug msg="RequestAddress(0.0.0.0/0, <nil>, map[])"
time="2023-03-27T19:36:42.919705900+02:00" level=debug msg="endpointStruct.EnableInternalDNS =[false]"
time="2023-03-27T19:36:42.919705900+02:00" level=debug msg="[POST]=>[/endpoints/] Request : {\"VirtualNetwork\":\"A03EF562-5FA4-40B8-B5D7-E91E5AA9F788\",\"Policies\":[{\"Type\":\"NAT\",\"Protocol\":\"tcp\",\"InternalPort\":80,\"ExternalPort\":5015,\"ExternalPortReserved\":true}],\"EnableInternalDNS\":true}"
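To spot the same pattern in a longer dockerd debug log, a small sketch like the following could pair each "Assigning addresses" line with the RequestAddress pool that follows it. The helper names are made up here, and the patterns only match the two messages quoted above; whether the 0.0.0.0/0 request is the actual cause of the hang is not established by this.

```python
import re

# Patterns for the two debug messages quoted in the logs above.
NETWORK_RE = re.compile(r'interface on network ([^"\s]+)')
POOL_RE = re.compile(r"RequestAddress\(([^,]+),")


def pools_by_network(log_lines):
    """Pair each 'Assigning addresses' line with the next RequestAddress pool."""
    results = []
    current_network = None
    for line in log_lines:
        network_match = NETWORK_RE.search(line)
        if network_match:
            current_network = network_match.group(1)
            continue
        pool_match = POOL_RE.search(line)
        if pool_match and current_network is not None:
            results.append((current_network, pool_match.group(1)))
            current_network = None
    return results


def unspecified_pools(log_lines):
    """Return the networks whose address request used the 0.0.0.0/0 pool."""
    return [net for net, pool in pools_by_network(log_lines) if pool == "0.0.0.0/0"]
```

Fed the two log excerpts above, this would report the pool 172.19.192.0/20 for network nat and flag test2_default as the one requesting 0.0.0.0/0.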

But it is possible to start the web container twice in the same project with docker compose:

version: "3.9"
services:
  web1:
    image:  mcr.microsoft.com/dotnet/framework/aspnet:4.8
    ports:
      - "11001:80"
    depends_on:
      - mongo
  web2:
    image:  mcr.microsoft.com/dotnet/framework/aspnet:4.8
    ports:
      - "11002:80"
    depends_on:
      - mongo
  mongo:
    image: "mongo:5.0-nanoserver"
    ports:
      - "21001:27017"

So what the heck is this? Bug in Windows?

fashberg avatar Mar 27 '23 18:03 fashberg

For a couple of weeks now, I have had the same problem with a Windows Server 2022. I'm running multiple services with exposed ports. It worked fine for months. I even restored the whole server from a backup, but still, after docker compose up, the first service starts and then it gets stuck with the HNS service taking 20-27% CPU. Another strange thing is that I don't see the virtual network adapters in "Network & Connections".

RETH666 avatar Mar 28 '23 11:03 RETH666

@RETH666 can you please confirm this only applies to containers created by docker compose, as requested in https://github.com/docker/compose/issues/10383#issuecomment-1481344482

ndeloof avatar Mar 28 '23 12:03 ndeloof

@ndeloof Yes, I can confirm that. If I run the containers with their ports with docker run, not a problem

RETH666 avatar Mar 28 '23 12:03 RETH666

We have the same problem. Windows Server 2022. Starting multiple containers with docker-compose is not working. Starting them with "docker run ..." works without a problem.

Studiwerk avatar Jul 27 '23 12:07 Studiwerk

Hey! I have the same issue, which I described here (I didn't find a similar bug here before): github.com. So maybe you can link this bug with the one I entered.

I also described the same issue on forums.docker.com, where there is more information about it in the comments.

mariotomek avatar Aug 21 '23 11:08 mariotomek

Hi, I have the same issue and I solved it by creating an external common network for both compose configurations:

  • first docker-compose.yml
version: '3.4'
services:
  service1:
    ports:
      - "8440:80"
    networks:
      - mynet
  service2:
    ports:
      - "8441:80"
    networks:
      - mynet
  service3:
    ports:
      - "8442:80"
    networks:
      - mynet
networks:
  mynet:
    name: mynet
    external: true

  • second docker-compose.yml

version: '3.4'
services:
  service1:
    ports:
      - "8443:80"
    networks:
      - mynet
  service2:
    ports:
      - "8444:80"
    networks:
      - mynet
  service3:
    ports:
      - "8445:80"
    networks:
      - mynet
networks:
  mynet:
    name: mynet
    external: true

The shared network is created by: docker network create --driver nat mynet
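Since Compose does not create networks marked external: true, the network has to exist before either project is started. A hedged sketch of a helper that creates it only when missing (the function name and the injectable run parameter are illustrative, not part of any Docker tooling):

```python
import subprocess


def ensure_network(name="mynet", driver="nat", run=subprocess.run):
    """Create the network if `docker network inspect` cannot find it.

    Returns True when the network had to be created, False when it existed.
    The `run` parameter exists only so the helper can be exercised without
    a Docker daemon.
    """
    inspect = run(["docker", "network", "inspect", name], capture_output=True)
    if inspect.returncode == 0:
        return False
    run(["docker", "network", "create", "--driver", driver, name], check=True)
    return True
```

Calling ensure_network() once before docker compose up in each project directory would then be enough for both files above.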

valse avatar Apr 04 '24 07:04 valse

Hi,

I believe this error is caused by issue #486 in Windows Containers.

It's a network issue in Windows Server 2022 rather than a Docker Compose issue.

Because it is rare to create networks without orchestrators like Docker Compose, it appears that Docker Compose is at fault when it is not.

Docker Compose creates a new network by default for each compose file you run. That's why the error appears when you start the third compose setup, or when you create several networks in one compose setup. Spinning up multiple containers outside Docker Compose works because no networks are created.

See issue 486 in the link above for how to reproduce this exact problem without Docker Compose.
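The per-project network behaviour described above can be approximated with plain docker commands. The sketch below is not taken from issue 486; it only builds the two CLI invocations that roughly correspond to one Compose project (a dedicated <project>_default network plus one container publishing port 80). The function name and parameters are illustrative.

```python
def compose_like_commands(project, host_port, image, driver="nat", command=()):
    """Return docker CLI invocations that mimic one Compose project.

    The first command creates the project's own network, named the way
    Compose names its default network; the second starts a detached
    container on that network with host_port mapped to container port 80.
    """
    network = f"{project}_default"
    return [
        ["docker", "network", "create", "--driver", driver, network],
        ["docker", "run", "-d", "--network", network,
         "-p", f"{host_port}:80", image, *command],
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True); repeating this for three project names would, under the explanation above, be expected to hit the same hang without Compose being involved.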

leojth avatar Apr 26 '24 09:04 leojth

As @leojth mentioned, this is caused by an issue with Windows Containers. The most recent issue tracking the resolution is microsoft/Windows-Containers #140. For me, a temporary workaround for use cases with only one network has been to customise the default network, rather than defining a new one under the top-level networks key in docker-compose.yml.
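One possible reading of "customise the default network" is to map Compose's default network onto a network that already exists, so that no new network is created for the project. That reading is an assumption, as is the network name nat, which is taken from the dockerd log earlier in this thread where docker run attaches to "network nat". A sketch that appends such a section to an existing compose file (the helper name is hypothetical):

```python
# Top-level section that maps the project's default network onto an
# already existing network instead of letting Compose create one.
DEFAULT_NETWORK_BLOCK = """\
networks:
  default:
    name: {name}
    external: true
"""


def with_existing_default_network(compose_text, name="nat"):
    """Append a top-level networks section pointing `default` at `name`.

    Raises ValueError if the file already has a top-level `networks:` key,
    since merging sections is outside the scope of this sketch.
    """
    if any(line.startswith("networks:") for line in compose_text.splitlines()):
        raise ValueError("compose file already defines top-level networks")
    if not compose_text.endswith("\n"):
        compose_text += "\n"
    return compose_text + DEFAULT_NETWORK_BLOCK.format(name=name)
```

Services that do not list any networks of their own join default, so with this section in place they would attach to the existing network.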

marceliwac avatar Jul 09 '24 09:07 marceliwac

ok then, closing this issue so you can watch progress made on https://github.com/microsoft/Windows-Containers to cover this usage

ndeloof avatar Jul 09 '24 09:07 ndeloof