
bug: OOM with Dubbo Proxy Plugin

Open qiyuan4f opened this issue 1 year ago • 0 comments

Current Behavior

OOM killer log:

Oct 12 10:35:01 coraool-access-sg-0 kernel: [ 75519] 65534 75519 457289 291585 2580480 0 0 openresty
Oct 12 10:35:01 coraool-access-sg-0 kernel: [ 75520] 65534 75520 442926 277219 2461696 0 0 openresty
Oct 12 10:35:01 coraool-access-sg-0 kernel: [ 75521] 65534 75521 465360 299665 2650112 0 0 openresty
Oct 12 10:35:01 coraool-access-sg-0 kernel: [ 75522] 65534 75522 395427 229694 2080768 0 0 openresty
Oct 12 10:35:01 coraool-access-sg-0 kernel: [ 75523] 65534 75523 166286 1122 135168 0 0 openresty
Oct 12 10:35:01 coraool-access-sg-0 kernel: [ 75525] 0 75525 173183 6666 192512 0 0 openresty
Oct 12 10:35:01 coraool-access-sg-0 kernel: [ 75563] 0 75563 4644 32 61440 0 0 assist_daemon
Oct 12 10:35:01 coraool-access-sg-0 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/apisix.service,task=openresty,pid=75521,uid=65534
Oct 12 10:35:01 coraool-access-sg-0 kernel: Out of memory: Killed process 75521 (openresty) total-vm:1861440kB, anon-rss:1192516kB, file-rss:0kB, shmem-rss:6144kB, UID:65534 pgtables:2588kB oom_score_adj:0
Oct 12 10:35:01 coraool-access-sg-0 systemd[1]: apisix.service: A process of this unit has been killed by the OOM killer.
Oct 12 10:35:02 coraool-access-sg-0 runner[66765]: #033[33mWARN#033[0m[2024-10-12T10:34:59+08:00] Response status code: 204 body data:
Oct 12 10:35:02 coraool-access-sg-0 runner[66765]: #033[36mINFO#033[0m[2024-10-12T10:34:59+08:00] POST /api/v2/builds/request, time spent 65.98 s
Oct 12 10:35:02 coraool-access-sg-0 runner[66765]: #033[36mINFO#033[0m[2024-10-12T10:34:59+08:00] [runner] no new job, skip.
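To make the per-process numbers in the OOM dump above easier to read, here is a minimal Python sketch that converts the rss column of the process table into KiB. It assumes the usual kernel column order (pid, uid, tgid, total_vm, rss, pgtables_bytes, swapents, oom_score_adj, name) and a 4 KiB page size; both are assumptions about this host, not something stated in the report. The helper name rss_kib_by_pid is made up for illustration.

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages, typical for x86-64 Linux

# Process-table rows copied from the OOM dump in this report.
# Assumed columns: [pid] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
OOM_ROWS = """\
[ 75519] 65534 75519 457289 291585 2580480 0 0 openresty
[ 75520] 65534 75520 442926 277219 2461696 0 0 openresty
[ 75521] 65534 75521 465360 299665 2650112 0 0 openresty
[ 75522] 65534 75522 395427 229694 2080768 0 0 openresty
[ 75523] 65534 75523 166286 1122 135168 0 0 openresty
[ 75525] 0 75525 173183 6666 192512 0 0 openresty
"""

ROW_RE = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(-?\d+)\s+(-?\d+)\s+(\S+)"
)


def rss_kib_by_pid(text, name="openresty"):
    """Return {pid: resident set size in KiB} for rows whose name matches."""
    result = {}
    for m in ROW_RE.finditer(text):
        if m.group(9) == name:
            # Group 5 is the rss column, counted in pages.
            result[int(m.group(1))] = int(m.group(5)) * PAGE_KIB
    return result


if __name__ == "__main__":
    rss = rss_kib_by_pid(OOM_ROWS)
    for pid, kib in sorted(rss.items()):
        print(f"pid {pid}: {kib / 1024:.0f} MiB")
    print(f"total: {sum(rss.values()) / 1024 / 1024:.2f} GiB")
```

Under these assumptions, pid 75521 comes out at 1198660 KiB, which equals the anon-rss plus shmem-rss values (1192516 kB + 6144 kB) on the "Killed process" line, and the four largest openresty processes together hold roughly 4.2 GiB of the 8G on the machine.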

Here is the OS memory trend. Hardware: 4 CPU cores, 8 GB memory.

image

Expected Behavior

The process runs stably and is not killed by the OOM killer.

Error Logs

The symptom is similar to another reported bug about Dubbo proxy short connections. We found no other logs.

2024/10/13 18:59:11 [error] 5215#5215: *534386 upstream timed out (110: Connection timed out) while connecting to upstream, client: 175.176.18.192, server: _, subrequest: "/track/v1/app/v1", upstream: "dubbo://172.22.15.129:20882"
2024/10/13 18:59:16 [error] 5218#5218: *1805 upstream timed out (110: Connection timed out) while connecting to upstream, client: 175.176.18.192, server: _, subrequest: "/track/v1/app/v1", upstream: "dubbo://172.22.15.121:20882"
2024/10/13 18:59:21 [error] 5215#5215: *534386 upstream timed out (110: Connection timed out) while connecting to upstream, client: 175.176.18.192, server: _, subrequest: "/track/v1/app/v1", upstream: "dubbo://172.22.15.128:20882"
2024/10/13 18:59:26 [error] 5218#5218: *2096143 upstream timed out (110:
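For a longer error log, it may help to count the connect timeouts per Dubbo upstream to see whether they cluster on one backend. The following is a rough Python sketch, using the three complete lines from the excerpt above as sample input; the function name timeouts_per_upstream and the regular expression are illustrative assumptions, not part of APISIX.

```python
import re
from collections import Counter

# The three complete error-log lines from this report.
ERROR_LOG = """\
2024/10/13 18:59:11 [error] 5215#5215: *534386 upstream timed out (110: Connection timed out) while connecting to upstream, client: 175.176.18.192, server: _, subrequest: "/track/v1/app/v1", upstream: "dubbo://172.22.15.129:20882"
2024/10/13 18:59:16 [error] 5218#5218: *1805 upstream timed out (110: Connection timed out) while connecting to upstream, client: 175.176.18.192, server: _, subrequest: "/track/v1/app/v1", upstream: "dubbo://172.22.15.121:20882"
2024/10/13 18:59:21 [error] 5215#5215: *534386 upstream timed out (110: Connection timed out) while connecting to upstream, client: 175.176.18.192, server: _, subrequest: "/track/v1/app/v1", upstream: "dubbo://172.22.15.128:20882"
"""

# Matches a timeout message and captures the quoted upstream address.
TIMEOUT_RE = re.compile(r'upstream timed out .*?upstream: "([^"]+)"')


def timeouts_per_upstream(log_text):
    """Count 'upstream timed out' entries per upstream address."""
    counts = Counter()
    for line in log_text.splitlines():
        m = TIMEOUT_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


if __name__ == "__main__":
    for upstream, n in timeouts_per_upstream(ERROR_LOG).most_common():
        print(upstream, n)
```

Lines that are cut off before the upstream field, like the last one in the excerpt, are simply skipped.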

Steps to Reproduce

Use APISIX 3.10.0 with etcd 3.5.16; no other specific software is involved.

Follow the official guide and run a stress test with WR. When QPS reaches 1K, the symptom appears faster.
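While the stress test runs, the memory growth can be tracked per worker rather than only at OS level. Below is a minimal Python sketch that samples VmRSS from /proc for processes named openresty (the process name shown in the OOM dump). It assumes a Linux host with /proc mounted; the function names and the 10-second interval are arbitrary choices for illustration.

```python
import os
import re
import time


def parse_vmrss_kib(status_text):
    """Extract the VmRSS value in kB from the text of /proc/<pid>/status."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def sample_rss(process_name="openresty", proc_root="/proc"):
    """Return {pid: VmRSS in kB} for all processes whose comm matches."""
    rss = {}
    if not os.path.isdir(proc_root):
        return rss  # not a Linux host, or /proc is not mounted
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() != process_name:
                    continue
            with open(os.path.join(proc_root, entry, "status")) as f:
                value = parse_vmrss_kib(f.read())
        except OSError:
            continue  # process exited or is not readable
        if value is not None:
            rss[int(entry)] = value
    return rss


if __name__ == "__main__":
    # Print one sample every 10 seconds until interrupted.
    while True:
        print(time.strftime("%H:%M:%S"), sample_rss())
        time.sleep(10)
```

If worker RSS keeps climbing under steady load and does not drop after the load stops, that would point at a leak in the worker processes rather than at normal request buffering.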

Environment

  • APISIX version (run apisix version): 3.10.0
  • Operating system (run uname -a): Linux
  • OpenResty / Nginx version (run openresty -V or nginx -V): 1.25.3.2
  • etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info): 3.5.16
  • APISIX Dashboard version, if relevant:
  • Plugin runner version, for issues related to plugin runners:
  • LuaRocks version, for installation issues (run luarocks --version):

qiyuan4f, Oct 13 '24 11:10