bug: dynamic upstream, one Eureka node is unavailable, half of the requests are lost after reloading

Open liquanzhou opened this issue 4 months ago • 3 comments

Current Behavior

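To get dynamic upstreams, I configured the IP addresses of two Eureka nodes. (screenshot)

With access to one Eureka node blocked, continuous requests succeed without problems. (screenshot)

After APISIX is reloaded, half of the requests fail. This reproduces consistently. (screenshot)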

    discovery:
      eureka:
        host:
          - "http://10.250.200.99:8761"
          - "http://10.250.200.98:8761"
        prefix: "/eureka/"
        fetch_interval: 30    # 30s
        weight: 100           # default weight for node
        timeout:
          connect: 2000       # 2000ms
          send: 2000          # 2000ms
          read: 5000          # 5000ms

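This appears to be affected by the following parameter: with a short fetch interval, APISIX recovers fairly quickly after a reload.

    fetch_interval: 30

Even if requests are lost for only a few seconds after a reload, that is unacceptable for a critical traffic entry point such as nginx, so I hope this can be improved:

On reload or restart, when one Eureka node is reachable and the other is not, the full list of dynamic upstream services can still be obtained from the reachable node, because both Eureka nodes hold the same data. Half of the requests should not fail just because one Eureka node cannot be reached.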
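The behavior requested above amounts to trying each configured Eureka host in turn and using the first registry that can be fetched. The following is a minimal Python sketch of that idea only. It is not APISIX's Lua implementation, and `fetch_registry`, `fetch_one`, `fake_fetch`, and the service name `SERVICE-A` are hypothetical names introduced for illustration.

```python
from typing import Callable, Iterable, Optional


def fetch_registry(
    hosts: Iterable[str],
    fetch_one: Callable[[str], dict],
) -> Optional[dict]:
    """Return the registry from the first host that answers, else None.

    `fetch_one` is a hypothetical callable that fetches the full registry
    from one Eureka host and raises an exception on timeout or error.
    """
    for host in hosts:
        try:
            return fetch_one(host)
        except Exception as err:  # one bad node must not abort the whole fetch
            print(f"failed to fetch registry from {host}: {err}")
    return None  # every host failed; the caller should keep its previous data


# Illustration with a fake fetcher: .98 is blocked, .99 is healthy.
def fake_fetch(host: str) -> dict:
    if "10.250.200.98" in host:
        raise TimeoutError("timeout")
    return {"applications": ["SERVICE-A"]}  # hypothetical registry content


hosts = ["http://10.250.200.98:8761", "http://10.250.200.99:8761"]
registry = fetch_registry(hosts, fake_fetch)
```

In this sketch the unreachable node only costs one timeout per fetch, and the service list still comes from the healthy node. Returning `None` when every host fails lets the caller keep serving from its last known list instead of an empty one.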

Expected Behavior

No response

Error Logs

No response

Steps to Reproduce

1. Configure APISIX to use Eureka as the registry, with two nodes:

    discovery:
      eureka:
        host:
          - "http://10.250.200.99:8761"
          - "http://10.250.200.98:8761"
        prefix: "/eureka/"
        fetch_interval: 3     # 3s
        weight: 100           # default weight for node
        timeout:
          connect: 2000       # 2000ms
          send: 2000          # 2000ms
          read: 2000          # 2000ms

2. On the host of one Eureka node, drop all requests from the APISIX IP:

    iptables -A INPUT -s 10.250.200.202 -j DROP

3. Send curl requests continuously: they succeed.

4. Reload APISIX:

    systemctl reload apisix

5. Send curl requests continuously again: half of them fail.
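To put a number on steps 3 and 5, the continuous curl requests can be replaced by a small probe that reports the fraction of failed requests. This is a sketch under assumptions: `http_status` and `failure_ratio` are hypothetical helpers, not part of APISIX, and any response outside the 2xx range is counted as a failure.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for `url`, or 0 if no valid response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the gateway answered with a 4xx or 5xx status
    except (OSError, http.client.HTTPException):
        return 0  # connection refused, timeout, DNS failure, broken response


def failure_ratio(
    url: str,
    count: int,
    get_status: Callable[[str], int] = http_status,
) -> float:
    """Send `count` requests and return the fraction that were not 2xx."""
    failed = sum(1 for _ in range(count) if not 200 <= get_status(url) < 300)
    return failed / count
```

For example, `failure_ratio("http://127.0.0.1:9080/some-route", 100)` could be run before and after `systemctl reload apisix`. The address and route here are placeholders and must be replaced with a route that actually uses the Eureka-discovered upstream.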

Environment

  • APISIX version (run apisix version):
  • Operating system (run uname -a):
  • OpenResty / Nginx version (run openresty -V or nginx -V):
  • etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info):
  • APISIX Dashboard version, if relevant:
  • Plugin runner version, for issues related to plugin runners:
  • LuaRocks version, for installation issues (run luarocks --version):

liquanzhou, Sep 11 '25 12:09

Hi @liquanzhou, thanks for your report. We have verified that this issue does exist and will schedule a fix.

Baoyuantop, Sep 12 '25 09:09

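I fixed this with AI-generated code. In my testing it resolves the problem: after one Eureka node goes down, a reload still loads the service list correctly. It can serve as a reference.

The log keeps showing warnings about the failed Eureka node:

    2025/10/22 19:55:26 [warn] 7225#7225: *40628 [lua] init.lua:200: failed to fetch registry from http://10.250.200.98:8761/eureka/: timeout, context: ngx.timer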

eureka.init.lua.txt

liquanzhou, Oct 22 '25 11:10

Hi @liquanzhou, welcome to submit a PR.

Baoyuantop, Oct 23 '25 04:10