HTTP output to Betterstack crashes with a Bus error on FreeBSD 14.2 amd64
Bug Report
Describe the bug
My goal is to send logs from my FreeBSD host and any jails running on it to Betterstack. When I enable the HTTP output plugin, fluent-bit crashes with a Bus Error when it tries to send the message forward.
To Reproduce
- Steps to reproduce the problem:
The configuration file looks like this:
[SERVICE]
    flush 1
    log_level info
    parsers_file parsers.conf
    plugins_file plugins.conf
    http_server Off
    http_listen 0.0.0.0
    http_port 2020
    storage.metrics on
[INPUT]
    tag syslog
    name tail
    path /var/log/messages
[INPUT]
    tag siansaksa
    name random
[OUTPUT]
    match *
    name stdout
    format json_lines
[OUTPUT]
    name http
    match *
    tls On
    host in.logs.betterstack.com
    port 443
    uri /fluentbit
    header Authorization Bearer XXXXXX # Token omitted for privacy
    header Content-Type application/msgpack
    format msgpack
    retry_limit 5
I execute Fluent Bit with doas -u nobody /usr/local/bin/fluent-bit -c /usr/local/etc/fluent-bit/fluent-bit.conf.
The execution crashes with a Bus Error after the first random entry is generated:
[2024/12/13 12:18:45] [ info] [config] changing coro_stack_size from 3072 to 4096 bytes
Fluent Bit v3.2.2
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io/
______ _ _ ______ _ _ _____ _____
| ___| | | | | ___ (_) | |____ |/ __ \
| |_ | |_ _ ___ _ __ | |_ | |_/ /_| |_ __ __ / /`' / /'
| _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / \ \ / /
| | | | |_| | __/ | | | |_ | |_/ / | |_ \ V /.___/ /./ /___
\_| |_|\__,_|\___|_| |_|\__| \____/|_|\__| \_/ \____(_)_____/
[2024/12/13 12:18:45] [ info] [fluent bit] version=3.2.2, commit=, pid=23342
[2024/12/13 12:18:45] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/12/13 12:18:45] [ info] [simd ] disabled
[2024/12/13 12:18:45] [ info] [cmetrics] version=0.9.9
[2024/12/13 12:18:45] [ info] [ctraces ] version=0.5.7
[2024/12/13 12:18:45] [ info] [input:tail:tail.0] initializing
[2024/12/13 12:18:45] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2024/12/13 12:18:45] [ info] [input:random:random.1] initializing
[2024/12/13 12:18:45] [ info] [input:random:random.1] storage_strategy='memory' (memory only)
[2024/12/13 12:18:45] [ info] [output:stdout:stdout.0] worker #0 started
[2024/12/13 12:18:45] [ info] [sp] stream processor started
[2024/12/13 12:18:45] [ info] [output:http:http.1] worker #0 started
[2024/12/13 12:18:45] [ info] [output:http:http.1] worker #1 started
{"date":1734085126.234562,"rand_value":6488732564125523264}
Bus error
Expected behavior
- Fluent Bit prints random log entries to the console
- Same entries are visible in Betterstack
Your Environment
- Version used: 3.1.9 and 3.2.2
- Configuration: See above
- Operating System and version: FreeBSD 14.2 on amd64
Additional context
Since the shipper doesn't work for me, I've been forced to install Fluentd and it makes me unhappy.
Some wrangling with the core dump says that the problem is in the ares__slist_node_first function. The address for head seems to be rax = 0xd234bc1b34275b44, which is not a multiple of 8, although a pointer should be 8-byte aligned on amd64.
We managed to get it working if there's only one active source and only one active output, random values in this case. Once I write something to the syslog, the next random entry brings down the process again.
Mitigated the problem by making the coro stack 80k
Was going to say, we don't technically support it directly as a platform: https://docs.fluentbit.io/manual/installation/supported-platforms
I was assuming you were compiling it directly, so we would need a lot more information about how/what you configured to do that, but it sounds like you got it sorted.
Initially I got it from the package system, i.e. as a prebuilt binary. See https://www.freshports.org/sysutils/fluent-bit/ for example. I did my own build for debugging purposes to get debugging symbols and address sanitizer. That build also used the setup from the ports system, so both cases built it the same way.
That build is unrelated to this project so we cannot support it.
Did you get it going then with the coro stack size change?
It's been running 15 hours now without crashing, keeping my fingers crossed :D
It might be worth adding to the general Raspbian builds then here: https://github.com/fluent/fluent-bit/blob/4d715c07d91ae8087a2bf1e6a185b3e95ac18914/packaging/distros/raspbian/Dockerfile#L63-L74
Hi!
I'm the "porter" for fluent-bit to FreeBSD. I guess, since you don't support the platform, I'm trying to do that for you. FreeBSD users are kind of used to this scenario. No problem.
I tried switching clang for gcc just to rule out problems related to clang, and the error persists with gcc as well.
> Mitigated the problem by making the coro stack 80k
How do you do that? Through configuration or in the build?
Configuration
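For anyone looking for the concrete setting: the startup log above mentions `coro_stack_size`, which is a key in the `[SERVICE]` section and takes a value in bytes. A sketch of the workaround, assuming "80k" means 80 KiB (81920 bytes):

```
[SERVICE]
    flush 1
    log_level info
    # Assumption: 80 KiB expressed in bytes
    coro_stack_size 81920
```

The remaining keys stay as in the original configuration file.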
So, perhaps we should just add a higher hard-coded value for the coro stack? Check this code:
https://github.com/fluent/fluent-bit/blob/d77c06dc6cae0eefe534435eee0fb024e3dc1021/include/fluent-bit/flb_coro.h
#ifdef FLB_SYSTEM_MACOS
#ifdef __aarch64__
#define STACK_FACTOR 1.5 /* Use 36KiB for coro stacks */
#else
#define STACK_FACTOR 2 /* Use 24KiB for coro stacks */
#endif
#else
#define STACK_FACTOR 1
#endif
#ifdef FLB_CORO_STACK_SIZE
#define FLB_CORO_STACK_SIZE_BYTE FLB_CORO_STACK_SIZE
#else
#define FLB_CORO_STACK_SIZE_BYTE ((3 * STACK_FACTOR * PTHREAD_STACK_MIN) / 2)
#endif
Would it make sense to add something similar for BSD + amd64?
ping @arsatiki do you agree with bumping the coro stack as per ☝️ ?
I've hard-coded it to the expected value according to the documentation. See: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=283299
PTHREAD_STACK_MIN is 4 * 512 in FreeBSD's include/pthread.h, so the default computed from it seems way too low in comparison with the 24576 mentioned in the docs (https://docs.fluentbit.io/manual/administration/configuring-fluent-bit).
Would this be OK?
--- include/fluent-bit/flb_coro.h.orig 2024-12-30 22:32:11.000000000 +0100
+++ include/fluent-bit/flb_coro.h 2025-01-06 23:50:52.035541000 +0100
@@ -68,7 +68,11 @@
#define STACK_FACTOR 2 /* Use 24KiB for coro stacks */
#endif
#else
+#ifdef FLB_SYSTEM_FREEBSD
+#define FLB_CORO_STACK_SIZE 24576 /* FreeBSD's PTHREAD_STACK_MIN is just 2048 */
+#else
#define STACK_FACTOR 1
+#endif
#endif
#ifdef FLB_CORO_STACK_SIZE
> ping @arsatiki do you agree with bumping the coro stack as per ☝️ ?
Oops sorry, completely missed your original question because of Christmas 😅 That sounds like a reasonable approach to me. I originally used 20k as the stack size, but that wasn't enough. I then used 80k as mentioned above, but it seems 24576 works too. I'll let you know if it does crash later though 😄
@girgen It's still running so let's say 24576 is okay.
Excellent. @patrick-stephens is there a point in me making a pull request?
Would you consider recognising the FreeBSD port as some sort of "community supported distribution"? See https://github.com/freebsd/freebsd-ports/tree/main/sysutils/fluent-bit
I added the above patch there in https://github.com/freebsd/freebsd-ports/blob/main/sysutils/fluent-bit/files/patch-include__flb_coro.h and it is used already as per https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=283299
I mean it would be great to submit a patch to ensure it works here but you can also point people at the BSD side maybe from the docs? Maybe from https://github.com/fluent/fluent-bit-docs/blob/master/installation/supported-platforms.md?
Even though we do not officially build for FreeBSD there's no issue with having a patch to support it (as long as it does not break any other targets). Probably simplifies your downstream usage then as well.