
Significantly higher latency under load when HTTP keep-alive is enabled

Open · s2x opened this issue 7 months ago · 5 comments

Environment:

  • Dell Latitude-5420
  • Ubuntu 25.04
  • kernel: 6.14.0-15-generic
  • PHP 8.4.5 (cli) (built: Mar 17 2025 20:35:32) (NTS)
  • amphp/http-server: 3.4.3

Benchmark tools:

  • wrk debian/4.1.0-4build2 [epoll] Copyright (C) 2012 Will Glozer
  • hey

Description: When running load tests against a minimal “hello world” PHP script served by amphp/http-server, enabling HTTP keep-alive causes per-request latency to jump to ~40 ms, whereas disabling keep-alive (i.e. forcing Connection: close) yields ~150–200 µs response times. Under keep-alive the PHP worker’s CPU usage is effectively zero.

Context: I’m currently developing a Symfony Runtime integration for amphp, and during performance testing I observed these elevated response times. At first I assumed the issue was in my own implementation, but switching to the official hello-world.php example reproduced the same behavior, indicating the problem lies within amphp/http-server itself.

Steps to reproduce:

  1. Run the example server: php examples/hello-world.php (a rough sketch of what this example does follows the benchmark output below)
  2. Run wrk with keep-alive (default): wrk --latency -t 1 -c 1 http://localhost:1337/
Running 10s test @ http://localhost:1337/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    41.12ms    1.78ms  43.20ms   98.35%
    Req/Sec    24.30      4.98    30.00     57.00%
  Latency Distribution
     50%   41.03ms
     75%   41.17ms
     90%   42.00ms
     99%   42.98ms
  243 requests in 10.01s, 42.40KB read
Requests/sec:     24.27
Transfer/sec:      4.24KB
  3. Run wrk with keep-alive disabled: wrk --header "Connection: Close" --latency -t 1 -c 1 http://localhost:1337/
Running 10s test @ http://localhost:1337/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   150.09us   28.03us   0.87ms   92.33%
    Req/Sec     5.52k   434.76     5.90k    90.10%
  Latency Distribution
     50%  143.00us
     75%  151.00us
     90%  169.00us
     99%  277.00us
  55473 requests in 10.10s, 7.88MB read
Requests/sec:   5492.54
Transfer/sec:    799.22KB
  4. Run hey with keep-alive (default): hey -c 1 -z 10s http://127.0.0.1:1337/
Summary:
  Total:	10.0128 secs
  Slowest:	0.0430 secs
  Fastest:	0.0013 secs
  Average:	0.0412 secs
  Requests/sec:	24.2690
  
  Total data:	3159 bytes
  Size/request:	13 bytes

Response time histogram:
  0.001 [1]	|
  0.005 [0]	|
  0.010 [0]	|
  0.014 [0]	|
  0.018 [0]	|
  0.022 [0]	|
  0.026 [0]	|
  0.030 [0]	|
  0.035 [0]	|
  0.039 [0]	|
  0.043 [242]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■


Latency distribution:
  10% in 0.0409 secs
  25% in 0.0410 secs
  50% in 0.0411 secs
  75% in 0.0419 secs
  90% in 0.0420 secs
  95% in 0.0421 secs
  99% in 0.0429 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0000 secs, 0.0013 secs, 0.0430 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0001 secs
  resp wait:	0.0006 secs, 0.0002 secs, 0.0014 secs
  resp read:	0.0405 secs, 0.0002 secs, 0.0421 secs

Status code distribution:
  [200]	243 responses
  5. Run hey with keep-alive disabled: hey -c 1 -z 10s -disable-keepalive http://127.0.0.1:1337/
Summary:
  Total:	10.0008 secs
  Slowest:	0.0025 secs
  Fastest:	0.0002 secs
  Average:	0.0003 secs
  Requests/sec:	3447.0364
  
  Total data:	448149 bytes
  Size/request:	13 bytes

Response time histogram:
  0.000 [1]	|
  0.000 [32562]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.001 [841]	|■
  0.001 [272]	|
  0.001 [696]	|■
  0.001 [88]	|
  0.002 [9]	|
  0.002 [2]	|
  0.002 [1]	|
  0.002 [0]	|
  0.003 [1]	|


Latency distribution:
  10% in 0.0002 secs
  25% in 0.0002 secs
  50% in 0.0003 secs
  75% in 0.0003 secs
  90% in 0.0003 secs
  95% in 0.0004 secs
  99% in 0.0010 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0001 secs, 0.0002 secs, 0.0025 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0009 secs
  resp wait:	0.0001 secs, 0.0001 secs, 0.0012 secs
  resp read:	0.0000 secs, 0.0000 secs, 0.0010 secs

Status code distribution:
  [200]	34473 responses
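
For reference, examples/hello-world.php is roughly equivalent to the minimal sketch below, assuming the amphp/http-server v3 API (SocketHttpServer::createForDirectAccess(), ClosureRequestHandler, DefaultErrorHandler); the real example file differs in detail, e.g. it also wires up console logging:

<?php
// Minimal hello-world sketch for amphp/http-server v3 (not the actual example file).
require __DIR__ . '/vendor/autoload.php';

use Amp\Http\Server\DefaultErrorHandler;
use Amp\Http\Server\Request;
use Amp\Http\Server\RequestHandler\ClosureRequestHandler;
use Amp\Http\Server\Response;
use Amp\Http\Server\SocketHttpServer;
use Psr\Log\NullLogger;

// Create a server that accepts connections directly (no reverse proxy in front).
$server = SocketHttpServer::createForDirectAccess(new NullLogger());
$server->expose('0.0.0.0:1337');

// Answer every request with a small plain-text body.
$server->start(
    new ClosureRequestHandler(
        fn (Request $request) => new Response(200, ['content-type' => 'text/plain'], 'Hello, world!'),
    ),
    new DefaultErrorHandler(),
);

// Block until SIGINT or SIGTERM, then shut down gracefully (requires ext-pcntl).
Amp\trapSignal([SIGINT, SIGTERM]);
$server->stop();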

Actual behavior:

With HTTP keep-alive:

  • Latency per request ~40 ms
  • PHP worker idle (≈0% CPU)

Without HTTP keep-alive:

  • Latency per request ~150–200 µs
  • PHP worker fully utilized

Expected behavior: Enabling HTTP keep-alive should reduce, or at worst have minimal impact on, per-request latency while maintaining high throughput; overall request-handling performance should be similar whether the connection is reused or not.

s2x · Jun 07 '25 19:06

Hi @s2x!

Thanks for the detailed report. I just wanted to note that this is on my radar; I plan to look into it further this coming weekend.

trowski · Jun 11 '25 03:06

Testing on my system (macOS, M4 Max) running PHP 8.4.9-dev ZTS DEBUG, I'm not able to replicate the latency you're seeing.

Any further information you could provide?

Test server command: php -dzend.assertions=-1 examples/hello-world.php

wrk with keep-alive: wrk --latency -t 1 -c 1 http://localhost:1337/

Running 10s test @ http://localhost:1337/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   101.23us   12.54us 472.00us   79.06%
    Req/Sec     9.84k   399.72    10.30k    79.21%
  Latency Distribution
     50%  100.00us
     75%  104.00us
     90%  113.00us
     99%  148.00us
  98927 requests in 10.10s, 18.49MB read
Requests/sec:   9794.85
Transfer/sec:      1.83MB

wrk without keep-alive: wrk --header "Connection: Close" --latency -t 1 -c 1 http://localhost:1337/

Running 10s test @ http://localhost:1337/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   210.81us  120.84us   3.90ms   95.25%
    Req/Sec     1.04k   837.83     3.03k    66.67%
  Latency Distribution
     50%  184.00us
     75%  204.00us
     90%  248.00us
     99%  668.00us
  4817 requests in 10.05s, 606.83KB read
  Socket errors: connect 0, read 4817, write 0, timeout 0
Requests/sec:    479.38
Transfer/sec:     60.39KB

hey with keep-alive: hey -c 1 -z 10s http://127.0.0.1:1337/


Summary:
  Total:        10.0004 secs
  Slowest:      0.0018 secs
  Fastest:      0.0001 secs
  Average:      0.0001 secs
  Requests/sec: 8238.5621
  

Response time histogram:
  0.000 [1]     |
  0.000 [82310] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.000 [70]    |
  0.001 [6]     |
  0.001 [1]     |
  0.001 [0]     |
  0.001 [0]     |
  0.001 [0]     |
  0.001 [0]     |
  0.002 [0]     |
  0.002 [1]     |


Latency distribution:
  10% in 0.0001 secs
  25% in 0.0001 secs
  50% in 0.0001 secs
  75% in 0.0001 secs
  90% in 0.0001 secs
  95% in 0.0001 secs
  99% in 0.0002 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0000 secs, 0.0001 secs, 0.0018 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0003 secs
  resp wait:    0.0001 secs, 0.0001 secs, 0.0007 secs
  resp read:    0.0000 secs, 0.0000 secs, 0.0002 secs

Status code distribution:
  [200] 82389 responses

hey without keep-alive: hey -c 1 -z 10s -disable-keepalive http://127.0.0.1:1337/


Summary:
  Total:        10.0002 secs
  Slowest:      0.0083 secs
  Fastest:      0.0002 secs
  Average:      0.0004 secs
  Requests/sec: 2319.9600
  

Response time histogram:
  0.000 [1]     |
  0.001 [22986] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.002 [207]   |
  0.003 [1]     |
  0.003 [1]     |
  0.004 [1]     |
  0.005 [1]     |
  0.006 [1]     |
  0.007 [0]     |
  0.008 [0]     |
  0.008 [1]     |


Latency distribution:
  10% in 0.0002 secs
  25% in 0.0003 secs
  50% in 0.0004 secs
  75% in 0.0005 secs
  90% in 0.0007 secs
  95% in 0.0008 secs
  99% in 0.0010 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0002 secs, 0.0002 secs, 0.0083 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0003 secs
  resp wait:    0.0002 secs, 0.0001 secs, 0.0007 secs
  resp read:    0.0000 secs, 0.0000 secs, 0.0001 secs

Status code distribution:
  [200] 23200 responses

trowski · Jun 15 '25 14:06

Hey, thanks for your quick interest in the topic.

I ran a few more tests with different versions of PHP and amphp and always get the same result: slower with keep-alive. I don't have any event-loop extensions installed, such as the libuv or libevent bindings.

The only difference in my environment is that I'm testing on the NTS build of PHP, while from what I can see you're on ZTS. That shouldn't make much of a difference, though. Tomorrow I'll try compiling PHP from source in ZTS mode and see if anything changes.

It seems to be some kind of Linux-related problem. To rule out an issue with my own machine, I ran a simple test on GCP e2-micro instances. The results are slightly different but show the same problem.

wrk -t 1 -c 1  http://127.0.0.1:1337/

Running 10s test @ http://127.0.0.1:1337/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.99ms  167.60us   8.62ms   97.25%
    Req/Sec   200.96      2.50   212.00     95.00%
  2001 requests in 10.00s, 383.00KB read
Requests/sec:    200.05
Transfer/sec:     38.29KB
wrk -t 1 -c 1 -H "connection: close"  http://127.0.0.1:1337/

Running 10s test @ http://127.0.0.1:1337/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   401.45us   77.97us   1.52ms   90.81%
    Req/Sec     1.79k    76.96     1.91k    70.37%
  4819 requests in 10.02s, 607.08KB read
  Socket errors: connect 0, read 4819, write 0, timeout 0
Requests/sec:    481.01
Transfer/sec:     60.60KB

As you can see, on GCP I get ~5 ms instead of the ~40 ms I see locally, but the run without keep-alive is still faster.

Unfortunately, I do not have access to a Mac to test this.

s2x · Jun 15 '25 21:06

Hey, thanks for your reply: it helped me realize I should look into the differences between macOS and Linux environments. After a quick search online I figured out what was causing the issue and managed to get much better response times, so this issue can be considered resolved.

For anyone facing a similar problem, here are the details:

It turns out that the TCP stack behaves differently on macOS than it does on Linux. The strange delays weren't caused by bad code but by Nagle's algorithm: it holds back small writes until the previous segment has been acknowledged, and combined with the peer's delayed-ACK timer (roughly 40 ms on Linux) this produces exactly the ~40 ms stalls seen above. To prevent these delays, you need to set the TCP_NODELAY option on the socket.

You can do this easily by passing a BindContext to the expose() call:

$server = new SocketHttpServer(...);
$server->expose("0.0.0.0:1337", (new Socket\BindContext())->withTcpNoDelay());
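
BindContext::withTcpNoDelay() comes from amphp/socket; at the plain-streams level it corresponds to PHP's tcp_nodelay socket context option (available since PHP 7.1). For anyone hitting the same stall without amphp, a standalone sketch:

<?php
// Enable TCP_NODELAY on a raw PHP stream server socket, so small writes are
// sent immediately instead of being buffered by Nagle's algorithm.
$context = stream_context_create(['socket' => ['tcp_nodelay' => true]]);
$socket = stream_socket_server(
    'tcp://0.0.0.0:1337',
    $errno,
    $errstr,
    STREAM_SERVER_BIND | STREAM_SERVER_LISTEN,
    $context,
);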

Result with TCP_NODELAY enabled:

Running 10s test @ http://127.0.0.1:1337/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    43.05us   15.01us   1.06ms   97.45%
    Req/Sec    23.32k     1.10k   24.43k    88.12%
  Latency Distribution
     50%   41.00us
     75%   43.00us
     90%   51.00us
     99%   71.00us
  234375 requests in 10.10s, 43.81MB read
Requests/sec:  23205.99
Transfer/sec:      4.34MB

Result with default settings:

Running 10s test @ http://127.0.0.1:1337/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    40.92ms    2.15ms  43.66ms   99.18%
    Req/Sec    24.44      5.04    30.00     51.00%
  Latency Distribution
     50%   41.01ms
     75%   41.10ms
     90%   41.19ms
     99%   42.11ms
  245 requests in 10.03s, 47.06KB read
Requests/sec:     24.42
Transfer/sec:      4.69KB

The only thing I'd add is perhaps a small mention in the documentation that this situation can occur.

s2x · Jun 16 '25 09:06

Nagle has reared his head and ruined benchmarks before. I don't think real-world applications will be affected as much; they might even benefit.

Perhaps the best thing to do would be to enable it in the examples with a comment explaining why (roughly as sketched below), and add a note about it in the README.
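
A sketch of what that example comment might look like; this is only an illustration, not an actual change:

// Disable Nagle's algorithm. Without TCP_NODELAY, small keep-alive responses
// can stall for ~40 ms on Linux while the kernel waits for a delayed ACK,
// which ruins single-connection benchmarks like the ones above.
$server->expose("0.0.0.0:1337", (new Socket\BindContext())->withTcpNoDelay());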

trowski · Jun 17 '25 00:06