lua-upstream-nginx-module

Compatibility with the resolve directive (string length overflow error)

Open NicoAdrian opened this issue 4 months ago • 7 comments

NGINX open-sourced dynamic DNS resolution (the resolve parameter) a year ago. However, it looks like lua-upstream-nginx-module is not compatible with it. When adding resolve (together with the required zone directive) to an upstream block, calling get_primary_peers results in this error:

lua entry thread aborted: runtime error: string length overflow
stack traceback:
coroutine 0:
	[C]: in function 'get_primary_peers'
	/usr/share/lua/5.1/resty/upstream/healthcheck.lua:845: in function 'status_page'

Example conf:

upstream my_backend {
  zone zone_my_backend 256k;
  server some_server:80 resolve;
  keepalive 128;
}
local hc = require("resty.upstream.healthcheck")
ngx.print(hc.status_page())

I should add that some_server resolves to 24 different IPs (12 IPv4 and 12 IPv6). Maybe that's too many? In any case, it should not crash like this.

NicoAdrian avatar Sep 09 '25 13:09 NicoAdrian

Thank you for reporting this issue with lua-upstream-nginx-module and NGINX's dynamic DNS resolution.

To better understand and troubleshoot your problem, could you please provide your complete nginx.conf configuration file? This would help me see the full context of your setup, including any relevant directives that might be affecting this behavior.

zhuizhuhaomeng avatar Sep 09 '25 13:09 zhuizhuhaomeng

Thank you for your quick response. I have a dozen *.conf files, so I will paste only the relevant parts (some are sensitive):


map $origin $backend {
  default $origin;
  originBad my_backend;

  originA haproxy;
  originB haproxy;
  originC haproxy;
}

upstream haproxy {
  server 127.0.0.1:83;
  keepalive 128;
}



upstream my_backend {
  zone zone_my_backend 256k;
  server some_server:80 resolve;
  keepalive 128;
}

The handler that fails (HTTP 500 with the string length overflow error above):

  location = /api/status {
    # No need to log the many systematic internal calls.
    access_log off;
    content_by_lua_block {
      if ngx.var.arg_format == "json" then
        local upstream = require("ngx.upstream")
        local cjson = require("cjson")
        local res = { upstreams = {} }
        local us = upstream.get_upstreams()
        local peers_groups = {
          primary = upstream.get_primary_peers,
          backup = upstream.get_backup_peers
        }

        for _, u in ipairs(us) do
          local statuses = { up = {}, down = {} }
          for group, get_peers in pairs(peers_groups) do
            local ups, downs = {}, {}
            local peers, _ = get_peers(u)
            for _, srv in ipairs(peers) do
              if not srv.down then
                ups[#ups + 1] = srv.name
              else
                downs[#downs + 1] = srv.name
              end
            end
            statuses.up[group] = #ups > 0 and ups or nil
            statuses.down[group] = #downs > 0 and downs or nil
          end
          res.upstreams[u] = statuses
        end
        ngx.header["Content-Type"] = "application/json"
        ngx.print(cjson.encode(res))
      else
        local hc = require("resty.upstream.healthcheck")
        ngx.print(hc.status_page())
      end
    }
  }
proxy_pass http://$backend;
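For the JSON branch above, a defensive variant of the peer loop (a sketch of my own; the pcall guard is not part of the module's documented API) would at least keep the endpoint from answering HTTP 500 when a peer call aborts:

```lua
-- Sketch (assumption): wrap the peer-listing call in pcall so a runtime
-- error raised inside get_primary_peers/get_backup_peers is reported
-- per upstream instead of aborting the whole handler.
for group, get_peers in pairs(peers_groups) do
  local ok, peers = pcall(get_peers, u)
  if not ok or not peers then
    -- on failure, `peers` carries the error message
    statuses.error = statuses.error or {}
    statuses.error[group] = tostring(peers)
  else
    local ups, downs = {}, {}
    for _, srv in ipairs(peers) do
      if not srv.down then
        ups[#ups + 1] = srv.name
      else
        downs[#downs + 1] = srv.name
      end
    end
    statuses.up[group] = #ups > 0 and ups or nil
    statuses.down[group] = #downs > 0 and downs or nil
  end
end
```

Since the traceback shows the overflow is raised from the C function itself ([C]: in function 'get_primary_peers'), pcall should catch it and the other upstreams would still be reported.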

I must say that if I remove the zone directive and the resolve parameter in the my_backend upstream block, it works fine.

NicoAdrian avatar Sep 09 '25 13:09 NicoAdrian

I need the minimal nginx.conf to reproduce this issue on my side.

zhuizhuhaomeng avatar Sep 09 '25 13:09 zhuizhuhaomeng

worker_processes  1;

events {
    worker_connections 1024;
}

http {
    upstream backend {
        zone my_zone 256k;
        server google.com:80 resolve;
    }

    resolver 172.50.0.2;

    server {
        listen 80;
        location = /api/status {
          content_by_lua_block {
              local hc = require("resty.upstream.healthcheck")
              ngx.print(hc.status_page())
          }
        }
    }
}

However, if instead of my_server you set, for example, google.com, it will work, because it resolves to only one or two IPs. Mine resolves to 24.

Edit: Edited the conf file to add google.com as upstream server.

NicoAdrian avatar Sep 09 '25 14:09 NicoAdrian

Never mind, it also crashes with google.com. Commenting out zone and removing resolve makes it work.

NicoAdrian avatar Sep 09 '25 14:09 NicoAdrian

The resolve parameter in server google.com:80 resolve is not available in open-source nginx before 1.27.3. The latest nginx supported by OpenResty is 1.27.1, so please wait for the next OpenResty release.

zhuizhuhaomeng avatar Sep 11 '25 00:09 zhuizhuhaomeng
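As a possible interim workaround (my suggestion, consistent with the observation earlier in the thread that removing zone/resolve makes the handler work), one can fall back to static, startup-time resolution until OpenResty ships an nginx >= 1.27.3:

```nginx
# Sketch: static fallback. The hostname is resolved once at startup,
# so dynamic DNS re-resolution is lost, but get_primary_peers and
# hc.status_page() work again.
upstream my_backend {
  server some_server:80;
  keepalive 128;
}
```

The trade-off is that a DNS change for some_server then requires an nginx reload to take effect.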

> The resolve parameter in server google.com:80 resolve is not available in open-source nginx before 1.27.3. The latest nginx supported by OpenResty is 1.27.1, so please wait for the next OpenResty release.

Ok, thank you. Is a new release supporting the resolve parameter already planned?

NicoAdrian avatar Sep 11 '25 07:09 NicoAdrian