Discover service error by Consul for apisix version 3.9

Open jzhao20230918 opened this issue 1 year ago • 21 comments

Description

Hello,

We use Consul for service discovery, and everything worked fine with APISIX v3.8. After upgrading to v3.9, we got the following errors:

2024/04/09 06:07:16 [error] 49#49: *8920 [lua] init.lua:91: nodes(): fetch nodes failed by , return default service, client: 10.0.2.59, server: _, request: "GET / HTTP/1.1", host: ""
2024/04/09 06:07:16 [error] 49#49: *8920 [lua] init.lua:548: handle_upstream(): failed to set upstream: no valid upstream node: nil, client: 10.0.2.59, server: _, request: "GET / HTTP/1.1", host: "***"

Nothing changed except the APISIX version. Thanks a lot.

Environment

  • APISIX version (run apisix version): 3.9
  • Operating system (run uname -a): Linux ip-10-0-2-59 6.2.0-1017-aws #17~22.04.1-Ubuntu SMP Fri Nov 17 21:07:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • OpenResty / Nginx version (run openresty -V or nginx -V):
  • etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info): {"id":"4d193d8b-f4b4-4a8c-9aed-9da25c014839","version":"3.9.0","hostname":"ip-10-0-2-59","boot_time":1712646773,"etcd_version":"3.5.0"}
  • APISIX Dashboard version, if relevant:
  • Plugin runner version, for issues related to plugin runners:
  • LuaRocks version, for installation issues (run luarocks --version):

jzhao20230918 avatar Apr 09 '24 07:04 jzhao20230918

The returned error is as follows:

<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>openresty</center>
<p><em>Powered by <a href="https://apisix.apache.org/">APISIX</a>.</em></p></body>
</html>

jzhao20230918 avatar Apr 10 '24 02:04 jzhao20230918

please share your configurations

shreemaan-abhishek avatar Apr 11 '24 05:04 shreemaan-abhishek

please share your configurations

I'm using https://github.com/apache/apisix/blob/master/conf/config-default.yaml and changed the etcd and consul configuration.

config.yaml.txt

jzhao20230918 avatar Apr 11 '24 07:04 jzhao20230918

btw, I run Apisix with docker image apache/apisix:3.9.0-debian

jzhao20230918 avatar Apr 11 '24 07:04 jzhao20230918

The information you provided is insufficient to attempt reproduction of this bug

shreemaan-abhishek avatar Apr 13 '24 02:04 shreemaan-abhishek

The information you provided is insufficient to attempt reproduction of this bug

here is a simple version of config: apisix: node_listen: 9080 enable_ipv6: false enable_control: true control: ip: "0.0.0.0" port: 9092 deployment: admin: allow_admin:
- 0.0.0.0/0 admin_key: - name: "admin" key: edd1c9f034335f136f87ad84b625c8f1 role: admin - name: "viewer" key: 4054f7cf07e344346cd3f287985e76a2 role: viewer etcd: host: - "http://etcd1.internal:2379" prefix: "/apisix" timeout: 30 discovery: consul: servers: - "http://consul.internal:8500" plugin_attr: prometheus: export_addr: ip: "0.0.0.0" port: 9091

I got this error after starting APISIX:

2024/04/15 03:29:29 [error] 49#49: *40 lua entry thread aborted: runtime error: /usr/local/apisix/apisix/discovery/consul/init.lua:525: attempt to concatenate local 'svc_port' (a nil value)

I suspect something is wrong here and that it causes service discovery by Consul to fail.

stack traceback:
coroutine 0:
/usr/local/apisix/apisix/discovery/consul/init.lua: in function </usr/local/apisix/apisix/discovery/consul/init.lua:362>, context: ngx.timer
2024/04/15 03:42:22 [error] 49#49: *7191 [lua] init.lua:91: nodes(): fetch nodes failed by ws-http-echo, return default service, client: 10.0.2.59, server: _, request: "GET / HTTP/1.1", host: "echo.external.apisix"
2024/04/15 03:42:22 [error] 49#49: *7191 [lua] init.lua:548: handle_upstream(): failed to set upstream: no valid upstream node: nil, client: 10.0.2.59, server: _, request: "GET / HTTP/1.1", host: "echo.external.apisix"

jzhao20230918 avatar Apr 15 '24 03:04 jzhao20230918

root@ip-10-0-2-59:/home/ubuntu# curl -fsL http://127.0.0.1:9092/v1/discovery/consul/dump | jq
{
  "services": {},
  "config": {
    "token": "",
    "keepalive": true,
    "weight": 1,
    "fetch_interval": 3,
    "timeout": {
      "connect": 2000,
      "read": 2000,
      "wait": 60
    },
    "servers": [
      "http://consul.internal:8500"
    ],
    "sort_type": "origin"
  }
}

jzhao20230918 avatar Apr 15 '24 08:04 jzhao20230918

Another node, running v3.8, returns:

root@ip-10-0-2-59:/home/ubuntu# curl -fsL http://10.0.2.60:9092/v1/discovery/consul/dump | jq
{
  "config": {
    "keepalive": true,
    "weight": 1,
    "timeout": {
      "connect": 2000,
      "read": 2000,
      "wait": 60
    },
    "fetch_interval": 3,
    "token": "",
    "servers": [
      "http://consul.internal:8500"
    ]
  },
  "services": {
    "alertmanager": [
      {
        "port": 20928,
        "host": "10.0.2.69",
        "weight": 1
      }
    ],
    ...
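For anyone comparing two nodes the same way, here is a minimal Python sketch that lists the services one dump has and the other lacks. It assumes both dumps were saved from the control API output and parsed as JSON. `missing_services` is a hypothetical helper name, not part of APISIX.

```python
from typing import Any, Dict, List


def missing_services(working_dump: Dict[str, Any], broken_dump: Dict[str, Any]) -> List[str]:
    """Return service names present in the working dump but absent from the other one."""
    working = working_dump.get("services") or {}
    broken = broken_dump.get("services") or {}
    return sorted(name for name in working if name not in broken)


# Trimmed-down versions of the two dumps from this thread.
v38_dump = {"services": {"alertmanager": [{"port": 20928, "host": "10.0.2.69", "weight": 1}]}}
v39_dump = {"services": {}}
lost = missing_services(v38_dump, v39_dump)
```

An empty "services" object on one node while another node sees entries from the same Consul server suggests the discovery loop on the first node is not completing.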

jzhao20230918 avatar Apr 15 '24 08:04 jzhao20230918

consul version 1.18

jzhao20230918 avatar Apr 15 '24 10:04 jzhao20230918

The information you provided is insufficient to attempt reproduction of this bug

hello, any other info needed?

jzhao20230918 avatar Apr 22 '24 02:04 jzhao20230918

@jzhao20230918 I haven't gotten the time to check this bug but the yaml configuration you shared isn't indented at all. Please fix it.

shreemaan-abhishek avatar Apr 22 '24 04:04 shreemaan-abhishek

apisix:
  node_listen: 9080
  enable_ipv6: false
  enable_control: true
  control:
    ip: "0.0.0.0"
    port: 9092
discovery:
  consul:
    servers:
      - "http://10.0.2.69:8500"
    sort_type: host_sort
    dump:
      path: "consul.dump"
      load_on_init: false
deployment:
  admin:
    allow_admin:  
      - 0.0.0.0/0
    admin_key:
      - name: "admin"
        key: edd1c9f034335f136f87ad84b625c8f1
        role: admin
      - name: "viewer"
        key: 4054f7cf07e344346cd3f287985e76a2
        role: viewer
  etcd:
    host:
      - "http://etcd.internal:2379"
    prefix: "/apisix"
    timeout: 30
plugin_attr:
  prometheus:
    export_addr:
      ip: "0.0.0.0"
      port: 9091

jzhao20230918 avatar Apr 22 '24 04:04 jzhao20230918

might be related to https://github.com/apache/apisix/pull/10941

jzhao20230918 avatar May 08 '24 09:05 jzhao20230918

@jzhao20230918 I haven't gotten the time to check this bug but the yaml configuration you shared isn't indented at all. Please fix it.

I found the root cause.

Line 525 was introduced in 3.9.0. https://github.com/apache/apisix/blob/release/3.9.1/apisix/discovery/consul/init.lua#L525 :

                local svc_address, svc_port = node.Service.Address, node.Service.Port
                -- if nodes is nil, new nodes table and set to up_services
                if not nodes then
                    nodes = core.table.new(1, 0)
                    up_services[service_name] = nodes
                end
                -- not store duplicate service IDs.
                local service_id = svc_address .. ":" .. svc_port

I had a service registered with Consul without a port, which caused svc_port to be nil. That brought down the whole Consul service discovery function. After I registered the service correctly, everything worked as before.
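To illustrate the failure mode, here is a minimal Python sketch of the same grouping logic with a guard added. This is not the APISIX code (which is Lua) and not a proposed patch. It assumes health entries shaped like the fields used in the excerpt (`Service.Address`, `Service.Port`) and that a registration without a port shows up as a missing value. `group_service_nodes`, the `broken` service, and its address are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def group_service_nodes(
    entries: List[Dict[str, Any]],
) -> Tuple[Dict[str, List[Dict[str, Any]]], List[str]]:
    """Group entries by service name, skipping entries with no usable address or port."""
    up_services: Dict[str, List[Dict[str, Any]]] = {}
    skipped: List[str] = []
    seen = set()

    for entry in entries:
        service = entry.get("Service") or {}
        name = service.get("Service")
        address = service.get("Address")
        port = service.get("Port")

        # Assumption: a registration without a port appears as a missing/None value.
        # Skip that one entry instead of letting the whole batch fail.
        if not name or not address or port is None:
            skipped.append(service.get("ID") or name or "<unknown>")
            continue

        service_id = f"{address}:{port}"  # same "address:port" key as the Lua excerpt
        if (name, service_id) in seen:
            continue  # do not store duplicate address:port pairs for a service
        seen.add((name, service_id))
        up_services.setdefault(name, []).append(
            {"host": address, "port": port, "weight": 1}
        )

    return up_services, skipped


entries = [
    {"Service": {"ID": "alertmanager-1", "Service": "alertmanager",
                 "Address": "10.0.2.69", "Port": 20928}},
    # Hypothetical registration with no port, like the one that triggered the error.
    {"Service": {"ID": "broken-1", "Service": "broken", "Address": "10.0.2.70"}},
]
up, skipped = group_service_nodes(entries)
```

With a guard like this, one bad registration is reported and skipped, and the remaining services are still discovered.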

jzhao20230918 avatar May 24 '24 07:05 jzhao20230918

I had a service registered with Consul without a port, which caused svc_port to be nil. That brought down the whole Consul service discovery function. After I registered the service correctly, everything worked as before.

Same here. @shreemaan-abhishek Any plan to fix it?

Karmenzind avatar Jun 25 '24 08:06 Karmenzind