in_tail: Preventing incorrect inode usage from db. #8025
Preventing incorrect inode usage from db.
The first commit involves deleting items that are not being monitored by Fluent-Bit when it starts.
The second commit involves considering already running Fluent-Bit instances, preventing incorrect inode usage during execution. This commit is more about enhancement option than fixing a bug. That's why 'true' and 'false' may be needed depending on the user's use case.
Fixes #8025
Enter [N/A] in the box, if an item is not applicable to your change.
Testing Before we can approve your change; please submit the following in a comment:
-
[x] Example configuration file for the change
-
fluent-bit-test.conf
[SERVICE]
flush 1
daemon Off
log_level info
parsers_file parsers.conf
plugins_file plugins.conf
storage.metrics on
storage.path /tmp/storage
storage.sync normal
storage.checksum off
storage.backlog.mem_limit 5M
[INPUT]
Name tail
Path /tmp/testing/bulk*
Exclude_Path *.gz,*.zip
log_level info
Offset_Key offset
Read_from_Head true
Refresh_Interval 1
Rotate_Wait 1
Inotify_Watcher false
storage.type filesystem
storage.pause_on_chunks_overlimit true
DB /tmp/input.db
DB.sync normal
DB.locking false
# new config
DB.compare_filename true
[OUTPUT]
name stdout
match *
- gen.sh To account for situations where file events may not be properly detected, the creation and deletion process is repeated without using 'sleep'.
#!/bin/bash
max_iterations=100000
for ((iteration = 1; iteration <= max_iterations; iteration++)); do
start_range=$((($iteration - 1) * 100 + 1))
end_range=$(($iteration * 100))
for i in $(seq $start_range $end_range); do
filename="bulk$i.log"
echo "$i." >> "$filename"
stat -c%i "$filename"
done
sleep 0
for i in $(seq $start_range $end_range); do
filename="bulk$i.log"
rm "$filename"
done
done
- (1/2) delete stale inode from db
mkdir -p /tmp/testing/
cd /tmp/testing/
rm -f /tmp/input.db
rm -f /tmp/testing/*.log
# fluentbit start
/opt/fluent-bit/bin/fluent-bit_fixed -c /opt/fluent-bit/etc/fluent-bit/fluent-bit-test.conf
# write
echo "1" > /tmp/testing/bulk.log;stat -c %i /tmp/testing/bulk.log
# fluentbit stop
# logroate and create new one.
rm -f /tmp/testing/bulk.log
# fluentbit start
/opt/fluent-bit/bin/fluent-bit_fixed -c /opt/fluent-bit/etc/fluent-bit/fluent-bit-test.conf
# log
[2023/10/19 05:50:40] [ info] [input:tail:tail.0] db: delete unmonitored stale inodes from the database: count=1
- (2/2) If the filename differs when retrieving the inode from the database, the inode is deleted.
mkdir -p /tmp/testing/
cd /tmp/testing/
rm -f /tmp/input.db
rm -f /tmp/testing/*.log
# fluentbit start
/opt/fluent-bit/bin/fluent-bit_fixed -c /opt/fluent-bit/etc/fluent-bit/fluent-bit-test.conf
# run gen script
while true; do sh ./gen.sh; sleep 0; done
# log
# Fluentbit can read or may fail to read, but it must not read incorrectly.
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: exists stale file from database: id=19 inode=4491280 offset=5 name=/tmp/testing/bulk126.log file_inode=4491280 file_name=/tmp/testing/bulk925.log
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: stale file deleted from database: id=19
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: exists stale file from database: id=20 inode=4491305 offset=5 name=/tmp/testing/bulk127.log file_inode=4491305 file_name=/tmp/testing/bulk926.log
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: stale file deleted from database: id=20
- [x] Debug log output from testing the change
(1/2) commit
[2023/10/19 05:13:58] [ info] [input:tail:tail.0] db: delete unmonitored stale inodes from the database: count=1
(2/2) commit
[2023/10/19 05:14:03] [ info] [input:tail:tail.0] db: exists stale file from database: id=1 inode=29632163 offset=12 name=test_db.log file_inode=29632163 file_name=test_db_moved.log
[2023/10/19 05:14:03] [ info] [input:tail:tail.0] db: stale file deleted from database: id=1
- [x] Attached Valgrind output that shows no leaks or memory corruption was found
valgrind --leak-check=full ./bin/flb-rt-in_tail
...
==120724== HEAP SUMMARY:
==120724== in use at exit: 0 bytes in 0 blocks
==120724== total heap usage: 98,919 allocs, 98,919 frees, 32,217,198 bytes allocated
==120724==
==120724== All heap blocks were freed -- no leaks are possible
==120724==
==120724== For lists of detected and suppressed errors, rerun with: -s
==120724== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
- [
N/A] Run local packaging test showing all targets (including any new ones) build. - [
N/A] Setok-package-testlabel to test for all targets (requires maintainer to do).
Documentation
- [x] Documentation required for this feature
Backporting
- [
N/A] Backport to latest stable release.
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
This PR is now ready for review. Thanks!
@jinyongchoi thanks for your PR looking forward to merge this, would you please address the comments pointed out by @pwhelan ? thank you
@pwhelan Thank you for the review. @edsiper I have addressed the points mentioned by @pwhelan. Please review the changes at your convenience and let me know if any further adjustments are needed. thank you!
Tested the new implementation, and it works as expected. When the option DB.compare_filename is enabled, the incorrect offset issue doesn't occur
@edsiper I have checked and all my comments have been resolved.
thank you