Using Read_From_Tail with in_systemd gives no input
Bug Report
Describe the bug
When Read_From_Tail is set to On with the systemd input plugin, no logs are read or streamed by fluent-bit. Removing the option, or setting it to Off, results in logs being streamed from the start of the journal (as expected), and the plugin has no problem following the tail after that.
To Reproduce
Steps to reproduce the problem:
- Create a configuration with a single [INPUT] from systemd and [OUTPUT] to stdout.
- Set Read_From_Tail to On
Configuration used:
[SERVICE]
    Log_Level debug
[INPUT]
    Name systemd
    Path /var/log/journal
    Read_From_Tail On
[OUTPUT]
    Name stdout
    Match *
Logs:
____________________
< Fluent Bit v2.2.2 >
-------------------
...
[2024/01/19 12:01:25] [ info] Configuration:
[2024/01/19 12:01:25] [ info] flush time | 1.000000 seconds
[2024/01/19 12:01:25] [ info] grace | 5 seconds
[2024/01/19 12:01:25] [ info] daemon | 0
[2024/01/19 12:01:25] [ info] ___________
[2024/01/19 12:01:25] [ info] inputs:
[2024/01/19 12:01:25] [ info] systemd
[2024/01/19 12:01:25] [ info] ___________
[2024/01/19 12:01:25] [ info] filters:
[2024/01/19 12:01:25] [ info] ___________
[2024/01/19 12:01:25] [ info] outputs:
[2024/01/19 12:01:25] [ info] stdout.0
[2024/01/19 12:01:25] [ info] ___________
[2024/01/19 12:01:25] [ info] collectors:
[2024/01/19 12:01:25] [ info] [fluent bit] version=2.2.2, commit=, pid=25006
[2024/01/19 12:01:25] [debug] [engine] coroutine stack size: 196608 bytes (192.0K)
[2024/01/19 12:01:25] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/01/19 12:01:25] [ info] [cmetrics] version=0.6.6
[2024/01/19 12:01:25] [ info] [ctraces ] version=0.4.0
[2024/01/19 12:01:25] [ info] [input:systemd:systemd.0] initializing
[2024/01/19 12:01:25] [ info] [input:systemd:systemd.0] storage_strategy='memory' (memory only)
[2024/01/19 12:01:25] [debug] [systemd:systemd.0] created event channels: read=25 write=27
[2024/01/19 12:01:25] [debug] [input:systemd:systemd.0] jump to the end of journal and skip 0 last entries
[2024/01/19 12:01:25] [debug] [input:systemd:systemd.0] sd_journal library may truncate values to sd_journal_get_data_threshold() bytes: 65536
[2024/01/19 12:01:25] [debug] [stdout:stdout.0] created event channels: read=31 write=60
[2024/01/19 12:01:25] [ info] [sp] stream processor started
[2024/01/19 12:01:25] [ info] [output:stdout:stdout.0] worker #0 started
Expected behavior
Systemd logs which occur after fluent-bit is launched are displayed on stdout.
Your Environment
- Version used: 2.2.2
- Configuration: (see above)
- Environment name and version (e.g. Kubernetes? What version?): Self-hosted
- Server type and version: Raspberry Pi 4B
- Operating System and version: NixOS 23.11
- Filters and plugins: none, built-in
Additional context
I wondered how NixOS's unique environment may be contributing to this. I built fluent-bit locally without issue. I have written other programs that run on this device which use the sd_journal libsystemd API without a problem.
Systemd version I am using:
# systemctl --version
systemd 254 (254.6)
+PAM +AUDIT -SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK -XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified
EDIT:
This bug is also reproducible with the above steps on up-to-date Arch Linux, fluent-bit v2.2.0 and systemd 255.2-3-arch.
sad bump cause I fixed this in Jan :(
I am seeing the same issue with recent versions of systemd (v250). Turning off Read_From_Tail seems to be working.
@nareshku turning off Read_From_Tail isn't an option for me - it's going to lead to excessive duplicates and unnecessary traffic writing logs if/when fluent-bit restarts.
This 3yr old bug in upstream systemd documents the issue: https://github.com/systemd/systemd/issues/17662
That upstream bug isn't being worked on, presumably because the workaround is reliable and the cause is complex. The accepted workaround is:
- Jump to journal tail
- Call prev() to move the cursor back one
- (Optionally wait for new entries, as we do in fluent-bit)
- Call next() to read the new entry from tail.
Without step 2, you'll always get an empty journal entry object, even if using the fd wait correctly and there is actually a log in the journal to retrieve.
PR #8396 fixes this by implementing that documented workaround (i.e., just adding the prev() call after seeking to tail), but it hasn't been merged yet.
@n-hass Actually, I tried this today and it works with Read_From_Tail set to On. What doesn't work for us is that our journal logs are in /var/log/journal and somehow fluent-bit does not find them. It only works when I set Path /var/log/journal.
@nareshku just to clarify, are you saying Read_From_Tail works for you in the current master branch as long as you specify the Path? I am not able to reproduce that with the following:
[INPUT]
    Name systemd
    Path /var/log/journal
    Read_From_Tail On
[OUTPUT]
    Name stdout
systemd version: systemd 255 (255.4)
This still collects no input from the journal.
FWIW, cosmo0920 confirmed this is an issue and that PR fixes it.
I had this same issue with the systemd input plugin and setting the DB configuration parameter alongside Read_From_Tail seemed to fix it. I am using YAML configuration with fluent-bit 2.2.2.
Interestingly, the documentation for the tail input plugin states that the DB configuration parameter is required for that plugin to read from the tail of target files, so I just applied that same logic to the systemd input plugin:
Note that if the database parameter DB is not specified, by default the plugin will start reading each target file from the beginning.
Thanks @eeowaa, but I wasn't able to reproduce with fluent-bit 3.0.2, systemd 255.4 and the following config:
[INPUT]
    Name systemd
    Read_From_Tail On
    DB /tmp/fluent-bit-journald-cursor
[OUTPUT]
    Name stdout
This still fails to retrieve any records from the journal for me, so there is no input. I suspect you may be using an earlier systemd version where this bug is not present?
@n-hass Sorry for the delay. I am using fluent-bit 2.2.2 and systemd 250.3.