snapd crashed on opening snapshot directory via gfapi
Description of problem: While trying to open the snapshot directory of a USS-enabled GlusterFS volume via gfapi, the snapshot daemon (snapd) crashed with the backtrace mentioned below.
The exact command to reproduce the issue: build and run the attached reproducer, test-snapdir-open.c.txt:
# gcc test-snapdir-open.c -lgfapi
# ./a.out 192.168.122.11 vol /tmp/ .snaps
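The attached reproducer is not inlined in this report. The following is a minimal sketch of what it presumably does, inferred from the command line above and the log output below; the argument order, log-file name, open flags, and message wording are assumptions, not the attachment's verbatim contents.

/* Hypothetical reconstruction of test-snapdir-open.c (a sketch, not
 * the attached file).
 * Build: gcc test-snapdir-open.c -lgfapi
 * Run:   ./a.out <host> <volume> <log-dir> <dir-to-open>
 */
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <glusterfs/api/glfs.h>

int
main(int argc, char *argv[])
{
    if (argc != 5) {
        fprintf(stderr, "usage: %s <host> <volume> <log-dir> <dir>\n",
                argv[0]);
        return EXIT_FAILURE;
    }

    /* Log file path assumed from the `cat /tmp/test-snapdir-open.log`
     * step below. */
    char logpath[PATH_MAX];
    snprintf(logpath, sizeof(logpath), "%s/test-snapdir-open.log", argv[3]);
    FILE *log = fopen(logpath, "w");
    if (!log)
        return EXIT_FAILURE;

    /* Connect to the volume over TCP via the default management port. */
    glfs_t *fs = glfs_new(argv[2]);
    if (!fs)
        return EXIT_FAILURE;
    glfs_set_volfile_server(fs, "tcp", argv[1], 24007);
    if (glfs_init(fs) != 0) {
        fprintf(log, "glfs_init : returned error (%s)\n", strerror(errno));
        fprintf(log, "Aborting due to some failures.\n");
        return EXIT_FAILURE;
    }

    /* Open the snapshot entry-point directory (.snaps). With snapd
     * healthy this should succeed; in this report it fails with
     * ENOTCONN because snapd aborts while servicing the open. */
    glfs_fd_t *fd = glfs_open(fs, argv[4], O_RDONLY);
    if (!fd) {
        fprintf(log, "glfs_open : returned error (%s)\n", strerror(errno));
        fprintf(log, "Aborting due to some failures.\n");
        glfs_fini(fs);
        return EXIT_FAILURE;
    }

    glfs_close(fd);
    glfs_fini(fs);
    fclose(log);
    return EXIT_SUCCESS;
}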
The full output of the command that failed:
# cat /tmp/test-snapdir-open.log
glfs_open : returned error (Transport endpoint is not connected)
Aborting due to some failures.
Expected results: The program opens the snapshot directory successfully and exits with status 0.
Mandatory info:
- The output of the gluster volume info command:
Volume Name: vol
Type: Distributed-Replicate
Volume ID: 0cda7c3f-b1d0-4873-8ab7-d04195b3f54d
Status: Started
Snapshot Count: 1
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: 192.168.122.11:/brick/brick0
Brick2: 192.168.122.13:/brick/brick0
Brick3: 192.168.122.11:/brick/brick1 (arbiter)
Brick4: 192.168.122.13:/brick/brick1
Brick5: 192.168.122.11:/brick/brick2
Brick6: 192.168.122.13:/brick/brick2 (arbiter)
Options Reconfigured:
user.smb: enable
features.barrier: disable
features.show-snapshot-directory: on
features.uss: enable
performance.write-behind: off
performance.parallel-readdir: on
performance.readdir-ahead: on
performance.nl-cache-timeout: 600
performance.nl-cache: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
performance.cache-samba-metadata: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
- The output of the gluster volume status command:
Before reproducing the issue:
Status of volume: vol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.122.11:/brick/brick0 52120 0 Y 1174
Brick 192.168.122.13:/brick/brick0 59815 0 Y 1175
Brick 192.168.122.11:/brick/brick1 54456 0 Y 1206
Brick 192.168.122.13:/brick/brick1 56448 0 Y 1207
Brick 192.168.122.11:/brick/brick2 53883 0 Y 1238
Brick 192.168.122.13:/brick/brick2 56394 0 Y 1239
Snapshot Daemon on localhost 56245 0 Y 1364
Self-heal Daemon on localhost N/A N/A Y 1390
Snapshot Daemon on dev-vm-2 55527 0 Y 1365
Self-heal Daemon on dev-vm-2 N/A N/A Y 1409
Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks
After reproducing the issue (note that the Snapshot Daemon on localhost is now offline):
Status of volume: vol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.122.11:/brick/brick0 52120 0 Y 1174
Brick 192.168.122.13:/brick/brick0 59815 0 Y 1175
Brick 192.168.122.11:/brick/brick1 54456 0 Y 1206
Brick 192.168.122.13:/brick/brick1 56448 0 Y 1207
Brick 192.168.122.11:/brick/brick2 53883 0 Y 1238
Brick 192.168.122.13:/brick/brick2 56394 0 Y 1239
Snapshot Daemon on localhost N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A Y 1390
Snapshot Daemon on dev-vm-2 55527 0 Y 1365
Self-heal Daemon on dev-vm-2 N/A N/A Y 1409
Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks
- Is there any crash? Provide the backtrace and coredump: Yes
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
[Current thread is 1 (Thread 0x7f87d74b2640 (LWP 1451))]
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007f87ea48ecb3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007f87ea43e9c6 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007f87ea4287f4 in __GI_abort () at abort.c:79
#4 0x00007f87ea42871b in __assert_fail_base (fmt=0x7f87ea5bbe60 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x7f87d75bd0ca "0",
file=0x7f87d75bc000 "../../../../../../glusterfs.git/xlators/features/snapview-server/src/snapview-server.c", line=2217, function=<optimized out>)
at assert.c:92
#5 0x00007f87ea437576 in __GI___assert_fail (assertion=assertion@entry=0x7f87d75bd0ca "0",
file=file@entry=0x7f87d75bc000 "../../../../../../glusterfs.git/xlators/features/snapview-server/src/snapview-server.c", line=line@entry=2217,
function=function@entry=0x7f87d75bd110 <__PRETTY_FUNCTION__.4> "svs_open") at assert.c:101
#6 0x00007f87d75b7ef8 in svs_open (frame=frame@entry=0x1bbe6e0, this=0x1c61000, loc=loc@entry=0x1e46038, flags=flags@entry=0, fd=fd@entry=0x1b3ce80,
xdata=xdata@entry=0x0) at ../../../../../../glusterfs.git/xlators/features/snapview-server/src/snapview-server.c:2217
#7 0x00007f87eaf12334 in default_open_resume (frame=0x1bbe840, this=0x1c62000, loc=0x1e46038, flags=0, fd=0x1b3ce80, xdata=0x0) at defaults.c:1857
#8 0x00007f87eae9810d in call_resume (stub=0x1e46000) at ../../../../glusterfs.git/libglusterfs/src/call-stub.c:2390
#9 0x00007f87d756bf08 in iot_worker (data=0x1bc2f00) at ../../../../../../glusterfs.git/xlators/performance/io-threads/src/io-threads.c:222
#10 0x00007f87ea48ce2d in start_thread (arg=<optimized out>) at pthread_create.c:442
#11 0x00007f87ea512620 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
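Note that the assertion string in frame #4 is literally "0": svs_open() reached a branch the code treats as unreachable and hit an unconditional assert, aborting snapd; the client then sees the ENOTCONN ("Transport endpoint is not connected") shown in the log above. The following is an illustrative sketch of that failure pattern only, not the actual snapview-server source; the type and function names are made up:

#include <assert.h>

/* Hypothetical names -- NOT the glusterfs source. */
enum svs_inode_kind { SVS_FILE, SVS_ENTRY_POINT_DIR };

static void
sketch_open(enum svs_inode_kind kind)
{
    switch (kind) {
    case SVS_FILE:
        /* open() on a file inside a snapshot: the handled case. */
        break;
    default:
        /* An open() on the .snaps entry-point directory is not
         * expected here (directories normally arrive via opendir),
         * so the code asserts instead of failing the call
         * gracefully -- taking the whole daemon down. */
        assert(0);
    }
}

int
main(void)
{
    sketch_open(SVS_ENTRY_POINT_DIR); /* aborts, mirroring the SIGABRT above */
    return 0;
}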
Additional info:
- The operating system / glusterfs version: Fedora 36, glusterfs master @ c27bdbeb107eb2c786f64857189d08cb674b4eb8