fluent-bit icon indicating copy to clipboard operation
fluent-bit copied to clipboard

in_tail: Preventing incorrect inode usage from db. #8025

Open jinyongchoi opened this issue 2 years ago • 5 comments

Preventing incorrect inode usage from db.

The first commit involves deleting items that are not being monitored by Fluent-Bit when it starts.

The second commit involves considering already running Fluent-Bit instances, preventing incorrect inode usage during execution. This commit is more about enhancement option than fixing a bug. That's why 'true' and 'false' may be needed depending on the user's use case.

Fixes #8025


Enter [N/A] in the box, if an item is not applicable to your change.

Testing Before we can approve your change; please submit the following in a comment:

  • [x] Example configuration file for the change

  • fluent-bit-test.conf


[SERVICE]
    flush        1
    daemon       Off
    log_level    info

    parsers_file parsers.conf
    plugins_file plugins.conf
    storage.metrics on
    storage.path /tmp/storage
    storage.sync normal
    storage.checksum off
    storage.backlog.mem_limit 5M

[INPUT]
    Name tail
    Path /tmp/testing/bulk*
    Exclude_Path *.gz,*.zip
    log_level    info
    Offset_Key   offset

    Read_from_Head true
    Refresh_Interval 1
    Rotate_Wait 1
    Inotify_Watcher false
    
    storage.type filesystem
    storage.pause_on_chunks_overlimit true

    DB /tmp/input.db
    DB.sync normal
    DB.locking false
    # new config
    DB.compare_filename true

[OUTPUT]
    name  stdout
    match *

  • gen.sh To account for situations where file events may not be properly detected, the creation and deletion process is repeated without using 'sleep'.

#!/bin/bash

max_iterations=100000

for ((iteration = 1; iteration <= max_iterations; iteration++)); do
    start_range=$((($iteration - 1) * 100 + 1))
    end_range=$(($iteration * 100))

    for i in $(seq $start_range $end_range); do
        filename="bulk$i.log"
        echo "$i." >> "$filename"
        stat -c%i "$filename"
    done

    sleep 0

    for i in $(seq $start_range $end_range); do
        filename="bulk$i.log"
        rm "$filename"
    done
done

  • (1/2) delete stale inode from db
mkdir -p  /tmp/testing/
cd /tmp/testing/
rm -f /tmp/input.db
rm -f /tmp/testing/*.log

# fluentbit start
/opt/fluent-bit/bin/fluent-bit_fixed -c /opt/fluent-bit/etc/fluent-bit/fluent-bit-test.conf

# write
echo "1" > /tmp/testing/bulk.log;stat -c %i /tmp/testing/bulk.log

# fluentbit stop

# logroate and create new one.
rm -f /tmp/testing/bulk.log

# fluentbit start
/opt/fluent-bit/bin/fluent-bit_fixed -c /opt/fluent-bit/etc/fluent-bit/fluent-bit-test.conf

# log
[2023/10/19 05:50:40] [ info] [input:tail:tail.0] db: delete unmonitored stale inodes from the database: count=1

  • (2/2) If the filename differs when retrieving the inode from the database, the inode is deleted.
mkdir -p  /tmp/testing/
cd /tmp/testing/
rm -f /tmp/input.db
rm -f /tmp/testing/*.log

# fluentbit start
/opt/fluent-bit/bin/fluent-bit_fixed -c /opt/fluent-bit/etc/fluent-bit/fluent-bit-test.conf

# run gen script
while true; do sh ./gen.sh; sleep 0; done

# log
# Fluentbit can read or may fail to read, but it must not read incorrectly.
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: exists stale file from database: id=19 inode=4491280 offset=5 name=/tmp/testing/bulk126.log file_inode=4491280 file_name=/tmp/testing/bulk925.log
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: stale file deleted from database: id=19
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: exists stale file from database: id=20 inode=4491305 offset=5 name=/tmp/testing/bulk127.log file_inode=4491305 file_name=/tmp/testing/bulk926.log
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: stale file deleted from database: id=20
  • [x] Debug log output from testing the change
(1/2) commit
[2023/10/19 05:13:58] [ info] [input:tail:tail.0] db: delete unmonitored stale inodes from the database: count=1
(2/2) commit
[2023/10/19 05:14:03] [ info] [input:tail:tail.0] db: exists stale file from database: id=1 inode=29632163 offset=12 name=test_db.log file_inode=29632163 file_name=test_db_moved.log
[2023/10/19 05:14:03] [ info] [input:tail:tail.0] db: stale file deleted from database: id=1
  • [x] Attached Valgrind output that shows no leaks or memory corruption was found
valgrind --leak-check=full ./bin/flb-rt-in_tail
...
==120724== HEAP SUMMARY:
==120724==     in use at exit: 0 bytes in 0 blocks
==120724==   total heap usage: 98,919 allocs, 98,919 frees, 32,217,198 bytes allocated
==120724== 
==120724== All heap blocks were freed -- no leaks are possible
==120724== 
==120724== For lists of detected and suppressed errors, rerun with: -s
==120724== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [x] Documentation required for this feature

#1238

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

jinyongchoi avatar Oct 19 '23 06:10 jinyongchoi

This PR is now ready for review. Thanks!

jinyongchoi avatar Oct 19 '23 16:10 jinyongchoi

@jinyongchoi thanks for your PR looking forward to merge this, would you please address the comments pointed out by @pwhelan ? thank you

edsiper avatar Dec 21 '23 18:12 edsiper

@pwhelan Thank you for the review. @edsiper I have addressed the points mentioned by @pwhelan. Please review the changes at your convenience and let me know if any further adjustments are needed. thank you!

jinyongchoi avatar Dec 22 '23 17:12 jinyongchoi

Tested the new implementation, and it works as expected. When the option DB.compare_filename is enabled, the incorrect offset issue doesn't occur

RicardoAAD avatar Dec 22 '23 19:12 RicardoAAD

@edsiper I have checked and all my comments have been resolved.

pwhelan avatar Jan 02 '24 04:01 pwhelan

thank you

edsiper avatar Apr 09 '24 22:04 edsiper