snapd crashed on opening snapshot directory via gfapi
Description of problem: While trying to open the snapshot directory of a USS-enabled GlusterFS volume via gfapi, the snapshot daemon (snapd) crashed with the backtrace mentioned below.
The exact command to reproduce the issue: build and run the attached reproducer, test-snapdir-open.c.txt:
# gcc test-snapdir-open.c -lgfapi
# ./a.out 192.168.122.11 vol /tmp/ .snaps
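The attached reproducer is not inlined in this report. The following is a minimal sketch of what it presumably does, inferred from the command line above and the log output below; the argument order, log-file name, open flags, and message wording are assumptions, not the attachment's verbatim contents.

/* Hypothetical reconstruction of test-snapdir-open.c (a sketch, not
 * the attached file).
 * Build: gcc test-snapdir-open.c -lgfapi
 * Run:   ./a.out <host> <volume> <log-dir> <dir-to-open>
 */
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <glusterfs/api/glfs.h>

int
main(int argc, char *argv[])
{
    if (argc != 5) {
        fprintf(stderr, "usage: %s <host> <volume> <log-dir> <dir>\n",
                argv[0]);
        return EXIT_FAILURE;
    }

    /* Log file path assumed from the `cat /tmp/test-snapdir-open.log`
     * step below. */
    char logpath[PATH_MAX];
    snprintf(logpath, sizeof(logpath), "%s/test-snapdir-open.log", argv[3]);
    FILE *log = fopen(logpath, "w");
    if (!log)
        return EXIT_FAILURE;

    /* Connect to the volume over TCP via the default management port. */
    glfs_t *fs = glfs_new(argv[2]);
    if (!fs)
        return EXIT_FAILURE;
    glfs_set_volfile_server(fs, "tcp", argv[1], 24007);
    if (glfs_init(fs) != 0) {
        fprintf(log, "glfs_init : returned error (%s)\n", strerror(errno));
        fprintf(log, "Aborting due to some failures.\n");
        return EXIT_FAILURE;
    }

    /* Open the snapshot entry-point directory (.snaps). With snapd
     * healthy this should succeed; in this report it fails with
     * ENOTCONN because snapd aborts while servicing the open. */
    glfs_fd_t *fd = glfs_open(fs, argv[4], O_RDONLY);
    if (!fd) {
        fprintf(log, "glfs_open : returned error (%s)\n", strerror(errno));
        fprintf(log, "Aborting due to some failures.\n");
        glfs_fini(fs);
        return EXIT_FAILURE;
    }

    glfs_close(fd);
    glfs_fini(fs);
    fclose(log);
    return EXIT_SUCCESS;
}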
The full output of the command that failed:
# cat /tmp/test-snapdir-open.log
glfs_open : returned error (Transport endpoint is not connected)
Aborting due to some failures.
Expected results: The program opens the snapshot directory successfully and exits with status 0.
Mandatory info:
- The output of the gluster volume info command:
Volume Name: vol
Type: Distributed-Replicate
Volume ID: 0cda7c3f-b1d0-4873-8ab7-d04195b3f54d
Status: Started
Snapshot Count: 1
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: 192.168.122.11:/brick/brick0
Brick2: 192.168.122.13:/brick/brick0
Brick3: 192.168.122.11:/brick/brick1 (arbiter)
Brick4: 192.168.122.13:/brick/brick1
Brick5: 192.168.122.11:/brick/brick2
Brick6: 192.168.122.13:/brick/brick2 (arbiter)
Options Reconfigured:
user.smb: enable
features.barrier: disable
features.show-snapshot-directory: on
features.uss: enable
performance.write-behind: off
performance.parallel-readdir: on
performance.readdir-ahead: on
performance.nl-cache-timeout: 600
performance.nl-cache: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
performance.cache-samba-metadata: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
- The output of the gluster volume status command:
Before reproducing the issue:
Status of volume: vol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.122.11:/brick/brick0 52120 0 Y 1174
Brick 192.168.122.13:/brick/brick0 59815 0 Y 1175
Brick 192.168.122.11:/brick/brick1 54456 0 Y 1206
Brick 192.168.122.13:/brick/brick1 56448 0 Y 1207
Brick 192.168.122.11:/brick/brick2 53883 0 Y 1238
Brick 192.168.122.13:/brick/brick2 56394 0 Y 1239
Snapshot Daemon on localhost 56245 0 Y 1364
Self-heal Daemon on localhost N/A N/A Y 1390
Snapshot Daemon on dev-vm-2 55527 0 Y 1365
Self-heal Daemon on dev-vm-2 N/A N/A Y 1409
Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks
After reproducing the issue (note that the Snapshot Daemon on localhost is now offline):
Status of volume: vol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.122.11:/brick/brick0 52120 0 Y 1174
Brick 192.168.122.13:/brick/brick0 59815 0 Y 1175
Brick 192.168.122.11:/brick/brick1 54456 0 Y 1206
Brick 192.168.122.13:/brick/brick1 56448 0 Y 1207
Brick 192.168.122.11:/brick/brick2 53883 0 Y 1238
Brick 192.168.122.13:/brick/brick2 56394 0 Y 1239
Snapshot Daemon on localhost N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A Y 1390
Snapshot Daemon on dev-vm-2 55527 0 Y 1365
Self-heal Daemon on dev-vm-2 N/A N/A Y 1409
Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks
- Is there any crash? Provide the backtrace and coredump: Yes
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
[Current thread is 1 (Thread 0x7f87d74b2640 (LWP 1451))]
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007f87ea48ecb3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007f87ea43e9c6 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007f87ea4287f4 in __GI_abort () at abort.c:79
#4 0x00007f87ea42871b in __assert_fail_base (fmt=0x7f87ea5bbe60 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x7f87d75bd0ca "0",
file=0x7f87d75bc000 "../../../../../../glusterfs.git/xlators/features/snapview-server/src/snapview-server.c", line=2217, function=<optimized out>)
at assert.c:92
#5 0x00007f87ea437576 in __GI___assert_fail (assertion=assertion@entry=0x7f87d75bd0ca "0",
file=file@entry=0x7f87d75bc000 "../../../../../../glusterfs.git/xlators/features/snapview-server/src/snapview-server.c", line=line@entry=2217,
function=function@entry=0x7f87d75bd110 <__PRETTY_FUNCTION__.4> "svs_open") at assert.c:101
#6 0x00007f87d75b7ef8 in svs_open (frame=frame@entry=0x1bbe6e0, this=0x1c61000, loc=loc@entry=0x1e46038, flags=flags@entry=0, fd=fd@entry=0x1b3ce80,
xdata=xdata@entry=0x0) at ../../../../../../glusterfs.git/xlators/features/snapview-server/src/snapview-server.c:2217
#7 0x00007f87eaf12334 in default_open_resume (frame=0x1bbe840, this=0x1c62000, loc=0x1e46038, flags=0, fd=0x1b3ce80, xdata=0x0) at defaults.c:1857
#8 0x00007f87eae9810d in call_resume (stub=0x1e46000) at ../../../../glusterfs.git/libglusterfs/src/call-stub.c:2390
#9 0x00007f87d756bf08 in iot_worker (data=0x1bc2f00) at ../../../../../../glusterfs.git/xlators/performance/io-threads/src/io-threads.c:222
#10 0x00007f87ea48ce2d in start_thread (arg=<optimized out>) at pthread_create.c:442
#11 0x00007f87ea512620 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
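Note that the assertion string in frame #4 is literally "0": svs_open() reached a branch the code treats as unreachable and hit an unconditional assert, aborting snapd; the client then sees the ENOTCONN ("Transport endpoint is not connected") shown in the log above. The following is an illustrative sketch of that failure pattern only, not the actual snapview-server source; the type and function names are made up:

#include <assert.h>

/* Hypothetical names -- NOT the glusterfs source. */
enum svs_inode_kind { SVS_FILE, SVS_ENTRY_POINT_DIR };

static void
sketch_open(enum svs_inode_kind kind)
{
    switch (kind) {
    case SVS_FILE:
        /* open() on a file inside a snapshot: the handled case. */
        break;
    default:
        /* An open() on the .snaps entry-point directory is not
         * expected here (directories normally arrive via opendir),
         * so the code asserts instead of failing the call
         * gracefully -- taking the whole daemon down. */
        assert(0);
    }
}

int
main(void)
{
    sketch_open(SVS_ENTRY_POINT_DIR); /* aborts, mirroring the SIGABRT above */
    return 0;
}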
Additional info:
- The operating system / glusterfs version: Fedora 36, glusterfs master @ c27bdbeb107eb2c786f64857189d08cb674b4eb8