pm2
Graviton3 support – intermittent crash/coredump
What's going wrong?
- The pm2 daemon crashes (coredump logged in /var/log/messages).
How could we reproduce this issue?
- It doesn't reproduce consistently: roughly 10% of our deploys (pm2 startOrReload) result in a crash/coredump.
- We just moved from Graviton2 to Graviton3, so I suspect the issue is related to an incompatibility with Graviton3.
Supporting information
--- PM2 report ----------------------------------------------------------------
Date : Mon Apr 08 2024 18:32:37 GMT+0000 (Coordinated Universal Time)
===============================================================================
--- Daemon -------------------------------------------------
pm2d version : 5.3.0
node version : 18.17.0
node path : /home/ec2-user/.nvm/versions/node/v18.17.0/bin/pm2
argv : /home/ec2-user/.nvm/versions/node/v18.17.0/bin/node,/home/ec2-user/.nvm/versions/node/v18.17.0/lib/node_modules/pm2/lib/Daemon.js
argv0 : node
user : ec2-user
uid : 1000
gid : 1000
uptime : 57min
===============================================================================
--- CLI ----------------------------------------------------
local pm2 : 5.3.0
node version : 18.17.0
node path : /home/ec2-user/.nvm/versions/node/v18.17.0/bin/pm2
argv : /home/ec2-user/.nvm/versions/node/v18.17.0/bin/node,/home/ec2-user/.nvm/versions/node/v18.17.0/bin/pm2,report
argv0 : node
user : ec2-user
uid : 1000
gid : 1000
===============================================================================
--- System info --------------------------------------------
arch : arm64
platform : linux
type : Linux
cpus : unknown
cpus nb : 4
freemem : 14946652160
totalmem : 16449142784
home : /home/ec2-user
===============================================================================
--- PM2 list -----------------------------------------------
┌────┬───────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┬──────────┬──────────┐
│ id │ name │ namespace │ version │ mode │ pid │ uptime │ ↺ │ status │ cpu │ mem │ user │ watching │
└────┴───────────┴─────────────┴─────────┴─────────┴──────────┴────────┴──────┴───────────┴──────────┴──────────┴──────────┴──────────┘
===============================================================================
--- Daemon logs --------------------------------------------
/home/ec2-user/.pm2/pm2.log last 20 lines:
PM2 | 2024-04-08T17:26:16: PM2 log: pid=149369 msg=failed to kill - retrying in 100ms
PM2 | 2024-04-08T17:26:16: PM2 log: pid=137480 msg=failed to kill - retrying in 100ms
PM2 | 2024-04-08T17:26:16: PM2 log: Process with pid 149369 still alive after 30000ms, sending it SIGKILL now...
PM2 | 2024-04-08T17:26:17: PM2 log: Process with pid 137480 still alive after 30000ms, sending it SIGKILL now...
PM2 | 2024-04-08T17:34:44: PM2 log: ===============================================================================
PM2 | 2024-04-08T17:34:45: PM2 log: --- New PM2 Daemon started ----------------------------------------------------
PM2 | 2024-04-08T17:34:45: PM2 log: Time : Mon Apr 08 2024 17:34:45 GMT+0000 (Coordinated Universal Time)
PM2 | 2024-04-08T17:34:45: PM2 log: PM2 version : 5.3.0
PM2 | 2024-04-08T17:34:45: PM2 log: Node.js version : 18.17.0
PM2 | 2024-04-08T17:34:45: PM2 log: Current arch : arm64
PM2 | 2024-04-08T17:34:45: PM2 log: PM2 home : /home/ec2-user/.pm2
PM2 | 2024-04-08T17:34:45: PM2 log: PM2 PID file : /home/ec2-user/.pm2/pm2.pid
PM2 | 2024-04-08T17:34:45: PM2 log: RPC socket file : /home/ec2-user/.pm2/rpc.sock
PM2 | 2024-04-08T17:34:45: PM2 log: BUS socket file : /home/ec2-user/.pm2/pub.sock
PM2 | 2024-04-08T17:34:45: PM2 log: Application log path : /home/ec2-user/.pm2/logs
PM2 | 2024-04-08T17:34:45: PM2 log: Worker Interval : 30000
PM2 | 2024-04-08T17:34:45: PM2 log: Process dump file : /home/ec2-user/.pm2/dump.pm2
PM2 | 2024-04-08T17:34:45: PM2 log: Concurrent actions : 2
PM2 | 2024-04-08T17:34:45: PM2 log: SIGTERM timeout : 1600
PM2 | 2024-04-08T17:34:45: PM2 log: ===============================================================================
Just migrated our instances back to Graviton2 (from m7g to m6g) and confirmed we are not able to reproduce this issue there.
Coredump backtrace:
(gdb) bt
#0 0x0000000000d61274 in v8::Object::Set(v8::Local<v8::Context>, v8::Local<v8::Value>, v8::Local<v8::Value>) ()
#1 0x0000000000c4df44 [PAC] in node::(anonymous namespace)::ProcessWrap::Spawn(v8::FunctionCallbackInfo<v8::Value> const&) ()
#2 0x0000000000dab0e0 [PAC] in v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) ()
#3 0x0000000000dac208 [PAC] in v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) ()
#4 0x000000000168000c [PAC] in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit ()
#5 0x005600000168000c in ?? ()
I have tried pm2 5.3.1 and 5.1.1 and ran into the same issue.
I also tried downgrading to node 18.12.1. Upgrading to node 20 appears to fix the issue.
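Frame #1 of the backtrace is in `ProcessWrap::Spawn`, the native code behind Node's `child_process.spawn`. To check whether the crash depends on pm2 at all, something like the following could be run on the affected instance with the same node binary. This is only a sketch: `spawnOnce` and `stressSpawn` are hypothetical names, not part of pm2 or Node, and it is an unverified assumption that spawning alone triggers the crash. The default concurrency of 2 just mirrors the "Concurrent actions : 2" line in the daemon log.

```javascript
// Hypothetical stress script (not from the original report): repeatedly
// spawns a trivial child process to exercise child_process.spawn without pm2.
const { spawn } = require('node:child_process');

// Spawn `node -e ""` once and resolve with its exit code and signal.
function spawnOnce() {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', ''], { stdio: 'ignore' });
    child.once('error', reject);
    child.once('exit', (code, signal) => resolve({ code, signal }));
  });
}

// Run `iterations` spawns, `concurrency` at a time, and return how many
// children exited with code 0.
async function stressSpawn(iterations, concurrency = 2) {
  let ok = 0;
  for (let i = 0; i < iterations; i += concurrency) {
    const batch = Math.min(concurrency, iterations - i);
    const results = await Promise.all(
      Array.from({ length: batch }, () => spawnOnce())
    );
    ok += results.filter((r) => r.code === 0).length;
  }
  return ok;
}

// Example: stressSpawn(1000).then((ok) => console.log(`${ok} clean exits`));
```

If the parent node process itself coredumps while this runs on node 18 but not on node 20, that would point at node rather than pm2. If it never crashes, the result is inconclusive, since pm2 passes different spawn options than this script does.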