yanbo

Results 31 comments of yanbo

```log
[postgres@mdw log]$ cat gpdb-2024-04-18_000000.csv | grep 'deadlock'
2024-04-18 02:09:43.246943 UTC,,,p26809,th-1525422016,,,,0,con380,cmd5036,seg-1,,,,sx1,"LOG","00000","global deadlock detected! Final graph is :{""seg0"": [""p1241 of dtx3430845286 con1088326 waits for a transactionid lock on ShareLock mode, blocked...
```

![image](https://github.com/greenplum-db/gpdb/assets/48461101/fd48fa98-e244-4396-9600-62dbef672a56)

No major changes in disk usage were observed. If a core file is generated, where will it be written? I don’t see any core files.

    $ ulimit -a
    core file size (blocks, -c) unlimited
    data seg size (kbytes, -d) unlimited
    scheduling priority (-e) 0
    file size (blocks, -f) unlimited
    pending signals (-i) 506905
    max locked...

    $ coredumpctl list
    No coredumps found.

It seems that the system cannot find the core file.

$ ls backup_label.old global gpssh.conf pg_commit_ts pg_hba.conf pg_multixact pg_serial pg_stat_tmp pg_twophase pg_xact postmaster.opts base gpbackup_history.db internal.auto.conf pg_distributedlog pg_ident.conf pg_notify pg_snapshots pg_subtrans PG_VERSION postgresql.auto.conf postmaster.pid current_logfiles gpsegconfig_dump log pg_dynshmem pg_logical pg_replslot...

```log
2024-04-18 02:12:15.032553 UTC,,,p2953,th-1525422016,,,,0,,,seg-1,,,,,"LOG","00000","3rd party error log:
PrivateRefCount: 8192 total in 1 blocks; 2584 free (0 chunks); 5608 used
MdSmgr: 8192 total in 1 blocks; 7656 free (0 chunks); 536...
```

> Next time when it happens, you can help use gcore to generate a core of the GDD process.

Thank you for your suggestion.

```
cat gpdb-2024-04-18_000000.csv | grep 'GddContext'
GddContext: 67008200704 total in 7998 blocks; 90592 free (7 chunks); 67008110112 used
GddContext: 67209527296 total in 8022 blocks; 90848 free (7 chunks); 67209436448 used...
```
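To see how fast `GddContext` grows between memory-context dumps, the grep output can be parsed rather than read by eye. The following is a minimal sketch, not part of Greenplum: the function names `parse_context_lines` and `growth` are hypothetical, and the regular expression assumes the line layout shown in the excerpt (`<name>: N total in N blocks; N free (N chunks); N used`).

```python
import re

# Assumed layout of one memory-context line, taken from the excerpt:
#   GddContext: <total> total in <blocks> blocks; <free> free (<chunks> chunks); <used> used
CONTEXT_LINE = re.compile(
    r"(?P<name>\w+): (?P<total>\d+) total in (?P<blocks>\d+) blocks; "
    r"(?P<free>\d+) free \((?P<chunks>\d+) chunks\); (?P<used>\d+) used"
)


def parse_context_lines(text, name="GddContext"):
    """Return one dict of integer fields per matching context line, in order."""
    samples = []
    for match in CONTEXT_LINE.finditer(text):
        if match.group("name") == name:
            samples.append(
                {key: int(match.group(key))
                 for key in ("total", "blocks", "free", "chunks", "used")}
            )
    return samples


def growth(samples):
    """Return the change in used bytes between consecutive samples."""
    return [later["used"] - earlier["used"]
            for earlier, later in zip(samples, samples[1:])]


# The two complete lines quoted in the comment above.
SAMPLE = (
    "GddContext: 67008200704 total in 7998 blocks; 90592 free (7 chunks); 67008110112 used\n"
    "GddContext: 67209527296 total in 8022 blocks; 90848 free (7 chunks); 67209436448 used\n"
)

if __name__ == "__main__":
    parsed = parse_context_lines(SAMPLE)
    for sample in parsed:
        print(f"{sample['used'] / 2**30:.2f} GiB used in {sample['blocks']} blocks")
    print("growth between dumps (bytes):", growth(parsed))
```

On the two quoted lines the used size rises by 201326336 bytes (roughly 192 MiB) across 24 additional blocks. In practice the text passed to `parse_context_lines` would be the contents of the CSV log file instead of `SAMPLE`.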

This situation occurred twice. The final graphs are as follows:

```
### 1.
"global deadlock detected! Final graph is : {""seg0"": [""p681 of dtx3113507890 con12475241 waits for a...
```

> @yanboer We have another good practice to trace similar OOM issue: using the following gdb script to print call stacks:
>
> ```
> handle SIGUSR1 nostop
> set...
> ```