rsync may segfault after failure to create hard link
ext4 supports a maximum number of 65000 hard links per inode.
When this limit is reached and rsync tries to create another hard link, it falls back to copying the file. However, under certain conditions, it may segfault later, after it has created another file in the same hard link group.
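To see how close a given inode already is to that limit before running rsync, the link count can be read with stat. A minimal sketch, assuming GNU coreutils stat; the function name links_left is made up for illustration, and the default limit of 65000 is the ext4 figure quoted above:

```shell
# Hypothetical helper: print how many more hard links an inode can take
# before reaching the given limit (default: 65000, the ext4 figure above).
# Assumes GNU coreutils stat, where "-c %h" prints the current link count.
links_left() {
    local file=$1
    local limit=${2:-65000}
    local count
    count=$(stat -c %h -- "$file") || return 1
    echo $(( limit - count ))
}
```

For example, `links_left src/hallo.txt` prints the number of additional links the source file of the reproducer can still take. A run that needs more links than that would be expected to hit the "Too many links" failure.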
Reproducer (requires ext4 and ssh authorization to the local host):
#! /bin/bash
set -xe
if [ ! -d src ]; then
    mkdir src
    echo "Hallo" > src/hallo.txt
    for i in $(seq -w 260); do
        ln src/hallo.txt src/hallo_$i.txt
    done
fi
# First 248 copies without network for speed
if [ ! -d dst ]; then
    mkdir -p dst
    for i in $(seq -w 248); do
        rsync -r -H \
            --link-dest ../../src \
            src/. dst/src_$i
    done
fi
test -d dst/src_249 && rm -rf dst/src_249
rsync -r -H \
    --link-dest ../../src \
    $(hostname -s):$(pwd)/src/. dst/src_249
Output (after a lot more output when run for the first time):
buczek@done:/amd/done/C/C8084/hardlink_test$ ./test.sh
+ '[' '!' -d src ']'
+ '[' '!' -d dst ']'
+ test -d dst/src_249
+ rm -rf dst/src_249
++ hostname -s
++ pwd
+ rsync -r -H --link-dest ../../src done:/amd/done/C/C8084/hardlink_test/src/. dst/src_249
rsync: [generator] link "/amd/done/C/C8084/hardlink_test/dst/src_249/hallo_011.txt" => hallo.txt failed: Too many links (31)
./test.sh: line 28: 7702 Segmentation fault (core dumped) rsync -r -H --link-dest ../../src $(hostname -s):$(pwd)/src/. dst/src_249
rsync: connection unexpectedly closed (8 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(228) [sender=3.2.4]
buczek@done:/amd/done/C/C8084/hardlink_test$ rsync: connection unexpectedly closed (3074 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(228) [receiver=3.2.4]
Output directory after failure:
buczek@done:/amd/done/C/C8084/hardlink_test$ ls -l dst/src_249/
total 52
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo.txt
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo_001.txt
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo_002.txt
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo_003.txt
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo_004.txt
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo_005.txt
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo_006.txt
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo_007.txt
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo_008.txt
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo_009.txt
-rw-rw-r-- 65000 buczek buczek 6 Jun 16 18:02 hallo_010.txt
-rw-rw-r-- 2 buczek buczek 6 Jun 16 18:16 hallo_011.txt
-rw-rw-r-- 2 buczek buczek 6 Jun 16 18:16 hallo_012.txt
Stack trace:
buczek@done:/amd/done/C/C8084/hardlink_test$ gdb --args ~/git/rsync/rsync -r -H --link-dest ../../src done:/amd/done/C/C8084/hardlink_test/src/. dst/src_249
GNU gdb (GDB) 11.1
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/buczek/git/rsync/rsync...
(gdb) r
Starting program: /home/buczek/git/rsync/rsync -r -H --link-dest ../../src done:/amd/done/C/C8084/hardlink_test/src/. dst/src_249
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
[Detaching after fork from child process 7801]
[Detaching after fork from child process 7806]
rsync: [generator] link "/amd/done/C/C8084/hardlink_test/dst/src_249/hallo_011.txt" => hallo.txt failed: Too many links (31)
Program received signal SIGSEGV, Segmentation fault.
0x0000000000441545 in finish_hard_link (file=0x41002, fname=0x7fffffffc630 "hallo_012.txt", fin_ndx=14, stp=0x7fffffffc460, itemizing=1, code=FLOG, alt_dest=-1) at hlink.c:511
511 file->flags = (file->flags & ~FLAG_HLINK_FIRST) | FLAG_HLINK_DONE;
(gdb) where
#0 0x0000000000441545 in finish_hard_link (file=0x41002, fname=0x7fffffffc630 "hallo_012.txt", fin_ndx=14, stp=0x7fffffffc460, itemizing=1, code=FLOG, alt_dest=-1) at hlink.c:511
#1 0x0000000000414cff in try_dests_reg (file=0x7ffff687b7d8, fname=0x7fffffffc630 "hallo_012.txt", ndx=14, cmpbuf=0x7fffffffb320 "../../src/hallo_012.txt", sxp=0x7fffffffc460, find_exact_for_existing=0, itemizing=1, code=FLOG) at generator.c:1034
#2 0x0000000000417416 in recv_generator (fname=0x7fffffffc630 "hallo_012.txt", file=0x7ffff687b7d8, ndx=14, itemizing=1, code=FLOG, f_out=4) at generator.c:1726
#3 0x000000000041959d in generate_files (f_out=4, local_name=0x0) at generator.c:2320
#4 0x000000000042858f in do_recv (f_in=3, f_out=4, local_name=0x0) at main.c:1102
#5 0x0000000000428ea1 in client_run (f_in=5, f_out=4, pid=7801, argc=1, argv=0x4a6718) at main.c:1358
#6 0x00000000004298b0 in start_client (argc=1, argv=0x4a6718) at main.c:1582
#7 0x0000000000429ef1 in main (argc=2, argv=0x4a6710) at main.c:1831
Btw: This depends on the --link-dest option used in the script.
A quick debugging session shows the following series of events:
- The first failure is the attempt to link hallo.txt (group leader) to hallo_011.txt in recv_generator() -> hard_link_check() -> maybe_hard_link() -> atomic_create() -> hard_link_one(). The file is not created and the receiver continues.
- For the next file, hallo_012.txt, rsync attempts to hard link the file from the --link-dest ../../src/hallo_012.txt in try_dests_reg(). This also fails.
- try_dests_reg() falls back to copy (goto try_a_copy) and hallo_012.txt gets created as a copy of ../../src/hallo_012.txt.
- Then finish_hard_link() gets called for hallo_012.txt, which walks backward through the files in the group. It hard links hallo_012.txt to hallo_011.txt, then gets to hallo.txt and then crashes, because F_HL_PREV of hallo.txt is 0 and flist_for_ndx(0, ...) seems to produce an invalid pointer.
Of course, rsync shouldn't dereference an invalid pointer. But I don't see a reasonable way to continue either. In my application I surely don't want hard links to be silently converted into copies, and I think I will just add an abort to the failure path of hard_link_check().
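Independent of any change to rsync, one way to notice a silent fallback after the fact is to count the distinct inodes in a destination directory. A rough sketch, assuming GNU find (for -printf '%i\n') and a directory in which all regular files are supposed to belong to a single hard link group, as in the reproducer; the function name distinct_inodes is hypothetical:

```shell
# Hypothetical check: print the number of distinct inodes among the regular
# files below a directory. Assumes GNU find, where "-printf '%i\n'" prints
# one inode number per file.
distinct_inodes() {
    local dir=$1
    find "$dir" -type f -printf '%i\n' | sort -u | wc -l | tr -d ' '
}
```

A result of 1 means every file still shares the same inode; anything larger means at least one file ended up as a separate copy. For the failed dst/src_249 directory listed above, the link counts (65000 versus 2) suggest this would report 2.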