EADDRINUSE error when trying to bind to port after it was closed
Version
20.12.2
Platform
Darwin olivers-mbp.lan 23.5.0 Darwin Kernel Version 23.5.0: Wed May 1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000 arm64 arm Darwin
Subsystem
No response
What steps will reproduce the bug?
The following script creates 2 cluster workers and each cluster worker does the following:
- Start server (A) on port 0 (random port).
- Close server A.
- Once server A has closed, start another server (B) on the same port as the previous server (A).
import cluster from 'node:cluster';
import express from 'express';

if (cluster.isPrimary) {
  const numCPUs = 2;

  console.log(`Master process ${process.pid} is running`);

  // Fork workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  const a = express();
  const b = express();

  const port = 0;
  console.log(`[${process.pid}] [A] call listen on port`, port);
  const serverA = a.listen(port, () => {
    const randomPort = serverA.address().port;
    console.log(`[${process.pid}] [A] listening on port`, randomPort);
    serverA.close((error) => {
      console.log(`[${process.pid}] [A] close`, error);
      console.log(`[${process.pid}] [B] call listen on port`, randomPort);
      const serverB = b.listen(randomPort, () => {
        console.log(`[${process.pid}] [B] listening on port`, randomPort);
      });
      serverB.on('error', (error) => {
        console.log(`[${process.pid}] [B] error`, error);
      });
    });
  });
}
How often does it reproduce? Is there a required condition?
No response
What is the expected behavior? Why is that the expected behavior?
No error.
What do you see instead?
Sometimes, but not always, we see an EADDRINUSE error. For example:
$ node test
Master process 16437 is running
[16438] [A] call listen on port 0
[16439] [A] call listen on port 0
[16438] [A] listening on port 58256
[16438] [A] close undefined
[16438] [B] call listen on port 58256
[16439] [A] listening on port 58256
[16439] [A] close undefined
[16439] [B] call listen on port 58256
[16439] [B] listening on port 58256
[16438] [B] error Error: bind EADDRINUSE null:58256
at listenOnPrimaryHandle (node:net:1969:18)
at rr (node:internal/cluster/child:163:12)
at Worker.<anonymous> (node:internal/cluster/child:113:7)
at process.onInternalMessage (node:internal/cluster/utils:49:5)
at process.emit (node:events:530:35)
at emit (node:internal/child_process:951:14)
at process.processTicksAndRejections (node:internal/process/task_queues:83:21) {
errno: -48,
code: 'EADDRINUSE',
syscall: 'bind',
address: null,
port: 58256
}
It seems to happen more frequently when the CPU is under pressure.
This is not expected because, as far as I understand:
- It should be possible to bind to the same port across cluster workers.
- Server A has been closed by the time we try to bind server B. (According to the documentation, the close callback is only called once the server has closed, i.e. once the port has been released?)
Additional information
I have been unable to reproduce the problem with a single cluster worker which suggests the problem only occurs when there's contention between cluster workers.
From what I see, I believe you're dealing with a race condition. If it happens more frequently while the CPU is under load, it must be a race condition. What are you trying to achieve? I can't follow your code. Why are you closing the port only to request it again?
I can’t follow your code.
I think this summarizes what the code is trying to do:
The following script creates 2 cluster workers and each cluster worker does the following:
- Start server (A) on port 0 (random port).
- Close server A.
- Once server A has closed, start another server (B) on the same port as the previous server (A).
What are you trying to achieve? Why are you closing the port only to request it again?
I am trying to generate a random port and then, at some point later on, use that random port.
I understand I could restructure the code to avoid the need for two servers within the same cluster worker, but I would like to understand why this error is occurring because it doesn't match the behaviour specified in the Node documentation.
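For context, the pattern I'm relying on can be sketched as a small helper (hypothetical, not from my app; `getFreePort` is a name I'm making up here) that binds to port 0, reads the assigned port, and closes the server before handing the port back:

```javascript
import net from 'node:net';

// Hypothetical helper illustrating the pattern: ask the OS for a free port
// by binding to port 0, read the assigned port, then release it.
function getFreePort() {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once('error', reject);
    server.listen(0, () => {
      const { port } = server.address();
      // Release the port; per the docs, the callback fires once the
      // server has fully closed.
      server.close((err) => (err ? reject(err) : resolve(port)));
    });
  });
}

getFreePort().then((port) => {
  console.log('free port:', port);
});
```

Note this also makes the window visible: between `close()` here and the later `listen()`, the port is not reserved for us, so in principle someone else could take it.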
AFAIK you can't bind the same port multiple times at the system level, as you can only have one in use at a time, so I'm not seeing the issue here.
If I understand correctly, clustering allows each worker to bind to the same port, for example this does not reproduce the EADDRINUSE error:
import cluster from 'node:cluster';
import express from 'express';

if (cluster.isPrimary) {
  const numCPUs = 2;

  console.log(`Master process ${process.pid} is running`);

  // Fork workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  const a = express();

  const port = 1234;
  console.log(`[${process.pid}] [A] call listen on port`, port);
  const serverA = a.listen(port, () => {
    console.log(`[${process.pid}] [A] listening on port`, port);
  });
}
However, you can't bind to the same port multiple times within the same cluster worker, for example this consistently reproduces the EADDRINUSE error:
import cluster from 'node:cluster';
import express from 'express';

if (cluster.isPrimary) {
  const numCPUs = 2;

  console.log(`Master process ${process.pid} is running`);

  // Fork workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  const a = express();
  const b = express();

  const port = 1234;
  console.log(`[${process.pid}] [A] call listen on port`, port);
  const serverA = a.listen(port, () => {
    console.log(`[${process.pid}] [A] listening on port`, port);
  });

  console.log(`[${process.pid}] [B] call listen on port`, port);
  const serverB = b.listen(port, () => {
    console.log(`[${process.pid}] [B] listening on port`, port);
  });
}
But this isn't what my original reduced test case is doing, because it closes one server before opening another:
import cluster from 'node:cluster';
import express from 'express';

if (cluster.isPrimary) {
  const numCPUs = 2;

  console.log(`Master process ${process.pid} is running`);

  // Fork workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  const a = express();
  const b = express();

  const port = 0;
  console.log(`[${process.pid}] [A] call listen on port`, port);
  const serverA = a.listen(port, () => {
    const randomPort = serverA.address().port;
    console.log(`[${process.pid}] [A] listening on port`, randomPort);
    serverA.close((error) => {
      console.log(`[${process.pid}] [A] close`, error);
      console.log(`[${process.pid}] [B] call listen on port`, randomPort);
      const serverB = b.listen(randomPort, () => {
        console.log(`[${process.pid}] [B] listening on port`, randomPort);
      });
      serverB.on('error', (error) => {
        console.log(`[${process.pid}] [B] error`, error);
      });
    });
  });
}
Most of the time it works, for example:
Master process 10743 is running
[10745] [A] call listen on port 0
[10744] [A] call listen on port 0
[10745] [A] listening on port 50484
[10744] [A] listening on port 50484
[10745] [A] close undefined
[10745] [B] call listen on port 50484
[10744] [A] close undefined
[10744] [B] call listen on port 50484
[10745] [B] listening on port 50484
[10744] [B] listening on port 50484
It's using the same port in each cluster worker and there's no problem. But occasionally we get an EADDRINUSE error:
Master process 10936 is running
[10937] [A] call listen on port 0
[10938] [A] call listen on port 0
[10938] [A] listening on port 50503
[10938] [A] close undefined
[10938] [B] call listen on port 50503
[10937] [A] listening on port 50503
[10937] [A] close undefined
[10937] [B] call listen on port 50503
[10938] [B] error Error: bind EADDRINUSE null:50503
at listenOnPrimaryHandle (node:net:1969:18)
at rr (node:internal/cluster/child:163:12)
at Worker.<anonymous> (node:internal/cluster/child:113:7)
at process.onInternalMessage (node:internal/cluster/utils:49:5)
at process.emit (node:events:530:35)
at emit (node:internal/child_process:951:14)
at process.processTicksAndRejections (node:internal/process/task_queues:83:21) {
errno: -48,
code: 'EADDRINUSE',
syscall: 'bind',
address: null,
port: 50503
}
[10937] [B] listening on port 50503
According to the documentation, the callback provided to server.close is only called when the server has closed:
the server is finally closed when all connections are ended and the server emits a 'close' event. The optional callback will be called once the 'close' event occurs
https://nodejs.org/api/net.html#serverclosecallback
So I don't understand why the port is still not available within the same cluster worker after the previous server (A) has been closed.
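To double-check the single-process contract, here's a sketch (plain node:net, no express, no cluster) that closes a server and rebinds its port in the same process, which matches my observation that a single worker doesn't reproduce the error:

```javascript
import net from 'node:net';
import { once } from 'node:events';

// Sketch: within a single process, rebinding a port after awaiting 'close'
// should succeed, per the documented server.close() contract.
async function reuseAfterClose() {
  const a = net.createServer();
  a.listen(0);
  await once(a, 'listening');
  const { port } = a.address();

  a.close();
  await once(a, 'close'); // server A is now fully closed

  const b = net.createServer();
  b.listen(port);
  await once(b, 'listening'); // no EADDRINUSE expected in a single process
  b.close();
  await once(b, 'close');
  return port;
}

reuseAfterClose().then((port) => console.log('reused port', port));
```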
It's also worth noting that I haven't been able to reproduce this problem when I don't use port 0 for the first server (A):
- const port = 0;
+ const port = 1234;
Another interesting discovery: I can't reproduce this problem when I specify the host as 127.0.0.1 to override the default:
- const serverA = a.listen(port, () => {
+ const serverA = a.listen(port, '127.0.0.1', () => {
@OliverJAsh Did you close server B?
import cluster from 'node:cluster';
import express from 'express';

if (cluster.isPrimary) {
  const numCPUs = 2;

  console.log(`Master process ${process.pid} is running`);

  // Fork workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  const a = express();
  const b = express();

  const port = 0;
  console.log(`[${process.pid}] [A] call listen on port`, port);
  const serverA = a.listen(port, () => {
    const randomPort = serverA.address().port;
    console.log(`[${process.pid}] [A] listening on port`, randomPort);
    serverA.close((error) => {
      console.log(`[${process.pid}] [A] close`, error);
      console.log(`[${process.pid}] [B] call listen on port`, randomPort);
      const serverB = b.listen(randomPort, () => {
        console.log(`[${process.pid}] [B] listening on port`, randomPort);
        serverB.close((error) => {
          console.log(`[${process.pid}] [B] close`, error); // make sure to close server B.
        });
      });
    });
  });
}
[2224] [A] call listen on port 0
[2224] [A] listening on port 39065
[2224] [A] close undefined
[2224] [B] call listen on port 39065
[2224] [B] listening on port 39065
[2224] [B] close undefined
[2225] [A] call listen on port 0
[2225] [A] listening on port 37889
[2225] [A] close undefined
[2225] [B] call listen on port 37889
[2225] [B] listening on port 37889
[2225] [B] close undefined
Thanks but I don't want to close server B. I need it to stay open.
As part of the connection cleanup, the close handler needs to be part of the server instantiation.
I'm not sure I understand. The problem is that initializing server B results in an error when it shouldn't. Closing server B is not a solution to that problem.
It looks to me that because server B was not closed, when server B listened for the second time, the close handler from the first connection hung the second connection.
[10938] [A] call listen on port 0
[10938] [A] listening on port 50503
[10938] [A] close undefined             // was closed
[10938] [B] call listen on port 50503   // was not properly closed
[10937] [A] listening on port 50503
[10937] [A] close undefined             // was closed
[10937] [B] call listen on port 50503   // when it was called the second time, it threw an error
[10938] [B] error Error: bind EADDRINUSE null:50503
at listenOnPrimaryHandle (node:net:1969:18)
at rr (node:internal/cluster/child:163:12)
at Worker.<anonymous> (node:internal/cluster/child:113:7)
at process.onInternalMessage (node:internal/cluster/utils:49:5)
at process.emit (node:events:530:35)
at emit (node:internal/child_process:951:14)
at process.processTicksAndRejections (node:internal/process/task_queues:83:21) {
errno: -48,
code: 'EADDRINUSE',
syscall: 'bind',
address: null,
port: 50503
}
const serverB = b.listen(randomPort, () => {
  console.log(`[${process.pid}] [B] listening on port`, randomPort);
  serverB.close((error) => {
    console.log(`[${process.pid}] [B] close`, error); // make sure to close server B.
  });
});
I tried this code and it ran with no issues.
I do not want to close server B.
For context: I have an application that opens a server (A) on port 0 to find a random port, closes that server, then opens another server (B) using that random port. This server needs to stay up.
I am running the application with clustering which is supposed to allow each worker to bind to the same port.
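The only mitigation I can think of for now (a workaround, not an explanation of the bug) is to retry the listen when the bind races. `listenWithRetry` below is a hypothetical helper, not part of my app or any library:

```javascript
import net from 'node:net';

// Hypothetical workaround: retry listen() a few times on EADDRINUSE,
// on the assumption that the port frees up once the race resolves.
function listenWithRetry(server, port, retries = 5, delayMs = 100) {
  return new Promise((resolve, reject) => {
    const onError = (err) => {
      if (err.code === 'EADDRINUSE' && retries-- > 0) {
        setTimeout(() => server.listen(port), delayMs);
      } else {
        server.off('error', onError);
        reject(err);
      }
    };
    server.on('error', onError);
    server.once('listening', () => {
      server.off('error', onError);
      resolve(server);
    });
    server.listen(port);
  });
}
```

This should work with any net.Server, including the http.Server that express's `app.listen` wraps, but obviously I'd rather understand the underlying race than paper over it.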
The close event in Node.js servers is emitted when the server has stopped accepting new connections and all existing connections have been closed. It is a crucial part of the server's lifecycle, indicating that the server is no longer active.
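A minimal illustration of that lifecycle with plain node:net (a sketch, not taken from the reproduction above):

```javascript
import net from 'node:net';

// 'close' fires only after the server has stopped accepting connections
// and all existing connections have ended.
const server = net.createServer();
const events = [];
server.on('listening', () => events.push('listening'));
server.on('close', () => {
  events.push('close');
  console.log(events.join(' -> ')); // listening -> close
});
server.listen(0, () => server.close());
```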