3.3: 502 - term_state PH-- after upgrade from 3.2.9

Open awlx opened this issue 1 month ago • 8 comments

Detailed Description of the Problem

After upgrading to haproxy 3.3.0, using alpn h2,http/1.1 on the server line causes a 502 with PH-- as term_state.

Expected Behavior

Works as before with 3.2.9.

Steps to Reproduce the Behavior

Use the following server backend config:

backend meet_backend
    mode http
    balance url_param room
    option httpchk
    hash-type consistent
    option httpchk GET /
    http-check send hdr Host meet.ffmuc.net hdr User-Agent Haproxy-HealthCheck
    http-check expect status 200

    server meet01 de1.ffmeet.net:443 ssl verify none check alpn h2,http/1.1 check-alpn http/1.1
    server meet03 de3.ffmeet.net:443 ssl verify none check alpn h2,http/1.1 check-alpn http/1.1

The check is green, but the answer when doing a curl via haproxy is now a 502.

Dec 08 10:57:17 webfrontend03 haproxy[4125582]: {"message":"10.8.1.0 meet.ffmuc.net - HEAD https://meet.ffmuc.net/ - 502","timestamp":"08/Dec/2025:10:57:17.880","http_host":"meet.ffmuc.net","port":"443","remote_addr":"10.8.1.0","remote_user":"","upstream_addr":"5.1.66.8:443","upstream_cache_status":"","upstream_duration":"3","http_request_method":"HEAD","http_request_uri":"/","http_uri":"https://meet.ffmuc.net/","http_params":"https://meet.ffmuc.net/","http_referer":"-","http_user_agent":"curl/8.17.0","http_protocol_version":"HTTP/3.0","response_status":"502","body_bytes_sent":"104","ssl_protocol":"TLSv1.3","gzip_ratio":"","pid":4125582,"haproxy_frontend_type":"http","haproxy_process_concurrent_connections":1,"haproxy_frontend_concurrent_connections":1,"haproxy_backend_concurrent_connections":0,"haproxy_server_concurrent_connections":0,"haproxy_backend_queue":0,"haproxy_server_queue":0,"haproxy_client_request_send_time":0,"haproxy_queue_wait_time":0,"haproxy_server_wait_time":3,"haproxy_server_response_send_time":-1,"response_time":-1,"session_duration":3,"request_termination_state":"PH--","haproxy_server_connection_retries":0,"remote_port":59719,"frontend_addr":"10.8.0.29","frontend_ssl_version":"TLS_AES_256_GCM_SHA384","frontend_ssl_ciphers":"TLS_AES_256_GCM_SHA384","haproxy_frontend_name":"https-in","haproxy_backend_name":"meet_backend","haproxy_server_name":"meet03","request_size":123}

With haproxy 3.2.9 the same config works flawlessly. The only way to make it work again on 3.3.0 is to remove h2 from the server's alpn list.
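For reference, a sketch of that workaround, assuming the rest of the backend stays exactly as shown above and only the alpn value on the server lines changes (backend connections then stay on HTTP/1.1 until a fixed build is deployed):

```
    # Workaround sketch: advertise only http/1.1 to the backend servers
    server meet01 de1.ffmeet.net:443 ssl verify none check alpn http/1.1 check-alpn http/1.1
    server meet03 de3.ffmeet.net:443 ssl verify none check alpn http/1.1 check-alpn http/1.1
```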

The backend server is nginx/1.28.0.

Do you have any idea what may have caused this?

No response

Do you have an idea how to solve the issue?

No response

What is your configuration?

See above. And here is the fixed version: https://github.com/freifunkMUC/ffmuc-salt-public/blob/main/haproxy/haproxy.cfg#L384

Output of haproxy -vv

HAProxy version 3.3.0-0+ha33+ubuntu24.04u2 2025/12/02 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2027.
Known bugs: http://www.haproxy.org/bugs/bugs-3.3.0.html
Running on: Linux 6.8.0-87-generic #88-Ubuntu SMP PREEMPT_DYNAMIC Sat Oct 11 09:28:41 UTC 2025 x86_64
Build options :
  TARGET  = linux-glibc
  CC      = x86_64-linux-gnu-gcc
  CFLAGS  = -O2 -g -fwrapv -fvect-cost-model=very-cheap -g -O2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -flto=auto -ffat-lto-objects -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -fdebug-prefix-map=/builds/haproxy-ce/deb-haproxy-awslc/debian/output/source_dir=/usr/src/haproxy-awslc-3.3.0-0+ha33+ubuntu24.04u2 -Wdate-time -D_FORTIFY_SOURCE=3
  OPTIONS = USE_OPENSSL=1 USE_OPENSSL_AWSLC=1 USE_LUA=1 USE_SLZ=1 USE_OT=1 USE_QUIC=1 USE_PROMEX=1 USE_MEMORY_PROFILING=1 USE_PCRE2=1 USE_PCRE2_JIT=1
  DEBUG   =
  
Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ECH -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE +KTLS -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY +LUA +MATH +MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL +OPENSSL_AWSLC -OPENSSL_WOLFSSL +OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL +PROMEX -PTHREAD_EMULATION +QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN +SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB +ACME

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_TGROUPS=32, MAX_THREADS=1024, default=6).
Built with SSL library version : OpenSSL 1.1.1 (compatible; AWS-LC 3.0.0)
Running on SSL library version : AWS-LC 3.0.0
SSL library supports TLS extensions : yes
SSL library supports SNI : yes
SSL library FIPS mode : no
SSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
QUIC: connection sock-per-conn mode support : yes
QUIC: GSO emission support : yes
Built with Lua version : Lua 5.4.6
Built with the Prometheus exporter as a service
Built with network namespace support.
Built with OpenTracing support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.42 2022-12-11
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 13.3.0

Available polling systems :   
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
       quic : mode=HTTP  side=FE|BE  mux=QUIC  flags=HTX|NO_UPG|FRAMED
         h2 : mode=HTTP  side=FE|BE  mux=H2    flags=HTX|HOL_RISK|NO_UPG
         h1 : mode=HTTP  side=FE|BE  mux=H1    flags=HTX|NO_UPG
  <default> : mode=HTTP  side=FE|BE  mux=H1    flags=HTX
       fcgi : mode=HTTP  side=BE     mux=FCGI  flags=HTX|HOL_RISK|NO_UPG
       spop : mode=SPOP  side=BE     mux=SPOP  flags=HOL_RISK|NO_UPG
  <default> : mode=SPOP  side=BE     mux=SPOP  flags=HOL_RISK|NO_UPG
       none : mode=TCP   side=FE|BE  mux=PASS  flags=NO_UPG
  <default> : mode=TCP   side=FE|BE  mux=PASS  flags=

Available services : prometheus-exporter
Available filters :
        [BWLIM] bwlim-in
        [BWLIM] bwlim-out
        [CACHE] cache
        [COMP] compression
        [FCGI] fcgi-app
        [  OT] opentracing
        [SPOE] spoe
        [TRACE] trace

Last Outputs and Backtraces


Additional Information

No response

awlx avatar Dec 08 '25 09:12 awlx

Hi Annika!

It's annoying because we took great care about exactly this with checks, but I suspect that "something" is causing some difficulties (typically using the H1 mux when H2 was negotiated or something like this). We'll have to try to reproduce with your config!

wtarreau avatar Dec 08 '25 12:12 wtarreau

I'm able to reproduce and indeed, the H1 multiplexer seems to be used, while h2 was negotiated.

capflam avatar Dec 08 '25 14:12 capflam

My bad, an H2 connection really is established. I'm still digging.

capflam avatar Dec 08 '25 14:12 capflam

I'm now able to reproduce the issue quite easily. It is a bug with the negotiated ALPN saved in the server parameters. In short, it is saved so that 0-RTT data can be emitted. But if this saved ALPN does not match the one negotiated for the current connection, an error is triggered by the server and the saved ALPN is never reset. There are two issues here. First, the ALPN negotiated for health-check connections must never be saved, because it may differ from the one used for regular traffic. The check may even be performed on a different port. Second, we must be able to reset the saved ALPN on error. Or at least, we must check it against the negotiated one and reset it on mismatch.

I will check how to fix the issue with @cognet tomorrow morning.

capflam avatar Dec 08 '25 17:12 capflam

Hi @awlx,

This should hopefully be fixed in master now, and should be backported to 3.3 soon. If you don't want to wait, you can just apply commits 260d64d7870ed3c4be62f98a4253e25fa3db6a6d, dcce9369129f6ca9b8eed6b451c0e20c226af2e3 and be4e1220c23fd45096e94006beac3b16453470ab, those should apply cleanly on 3.3.

Thanks a lot for reporting!

cognet avatar Dec 09 '25 15:12 cognet

Thanks, I will test this along with the fix for #3211, probably tomorrow. I'm having some problems building master against awslc from scratch atm (totally my fault, not something on your side :D).

As always many thanks to y'all for the super fast debugging and fixes.

awlx avatar Dec 09 '25 16:12 awlx

For building against aws-lc, I'd recommend just using the procedure documented in haproxy's INSTALL file. This is the one I use when I need it, and it has always worked for me so far.

wtarreau avatar Dec 09 '25 16:12 wtarreau

Managed to compile now (it was just me mixing up the include paths).

It's now running on one host of the fleet. So far there are no 502s for the h2 backends.

https://stats.ffmuc.net/d/aab5f489-3e89-45f7-9db1-bc8f73f68936/haproxy-metrics?orgId=1&from=now-3h&to=now&timezone=browser&var-host=webfrontend05&var-proxy=$__all&var-sv=$__all&refresh=1m&viewPanel=panel-30

haproxy -vv
HAProxy version 3.4-dev0-bc8e14-68 2025/12/09 - https://haproxy.org/
Status: development branch - not safe for use in production.
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open
Running on: Linux 6.8.0-87-generic #88-Ubuntu SMP PREEMPT_DYNAMIC Sat Oct 11 09:28:41 UTC 2025 x86_64
Build options :
  TARGET  = linux-glibc
  CC      = cc
  CFLAGS  = -O2 -g -fwrapv -fvect-cost-model=very-cheap
  OPTIONS = USE_OPENSSL=1 USE_OPENSSL_AWSLC=1 USE_QUIC=1 USE_PCRE2=1 USE_PCRE2_JIT=1
  DEBUG   =

Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ECH -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE +KTLS -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY -LUA -MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL +OPENSSL_AWSLC -OPENSSL_WOLFSSL -OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL -PROMEX -PTHREAD_EMULATION +QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN +SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB +ACME

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_TGROUPS=32, MAX_THREADS=1024, default=6).
Built with SSL library version : OpenSSL 1.1.1 (compatible; AWS-LC 1.65.1)
Running on SSL library version : AWS-LC 1.65.1
SSL library supports TLS extensions : yes
SSL library supports SNI : yes
SSL library FIPS mode : no
SSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
QUIC: connection sock-per-conn mode support : yes
QUIC: GSO emission support : yes
Built with network namespace support. 
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.42 2022-12-11
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 13.3.0

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.  

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
       quic : mode=HTTP  side=FE|BE  mux=QUIC  flags=HTX|NO_UPG|FRAMED
         h2 : mode=HTTP  side=FE|BE  mux=H2    flags=HTX|HOL_RISK|NO_UPG
  <default> : mode=HTTP  side=FE|BE  mux=H1    flags=HTX
        h1 : mode=HTTP  side=FE|BE  mux=H1    flags=HTX|NO_UPG
       fcgi : mode=HTTP  side=BE     mux=FCGI  flags=HTX|HOL_RISK|NO_UPG
  <default> : mode=SPOP  side=BE     mux=SPOP  flags=HOL_RISK|NO_UPG
       spop : mode=SPOP  side=BE     mux=SPOP  flags=HOL_RISK|NO_UPG
  <default> : mode=TCP   side=FE|BE  mux=PASS  flags=
       none : mode=TCP   side=FE|BE  mux=PASS  flags=NO_UPG

Available services : none

Available filters :
        [BWLIM] bwlim-in
        [BWLIM] bwlim-out
        [CACHE] cache
        [COMP] compression
        [FCGI] fcgi-app
        [SPOE] spoe
        [TRACE] trace

awlx avatar Dec 10 '25 07:12 awlx

One thing I just spotted is that the haproxy instance updated to the latest master still shows a higher connection retry count to the h2 backend.

https://stats.ffmuc.net/d/aab5f489-3e89-45f7-9db1-bc8f73f68936/haproxy-metrics?orgId=1&from=now-7d&to=now&timezone=browser&var-host=$__all&var-proxy=$__all&var-sv=$__all&refresh=1m&viewPanel=panel-15

I wonder if there is still something going on.

Edit// Maybe it was just a glitch as it now seems gone.

awlx avatar Dec 11 '25 11:12 awlx

Ok, so I'm closing, everything was backported to 3.3. Thanks !

capflam avatar Dec 12 '25 08:12 capflam