mtail icon indicating copy to clipboard operation
mtail copied to clipboard

Parsing stops when hitting UTF-8 replacement char

Open matejzero opened this issue 3 years ago • 3 comments

We are using mtail to parse some specific logs where sometimes there are replacement chars(�) in strings. It seems like when mtail hits a log line with this char, it stops parsing logs after this log line.

mtail version: mtail version 3.0.0-rc50

mtail program:

counter response_total by code
counter count_lines
const ONELINER /oneliner/

ONELINER {
  /<result code=\"(?P<code>1000|1300|1301|1500|2002|2200|2201|2202|2302|2303|2306|2308|greeting)\">/ {
  response_total[$code]++
  }
}

/$/ {
  count_lines++
}

Sample logs - non-working:

Nov 29 04:02:02.242 host.example.tld service INFO  [18636C07-1:foobar:1.2.3.4] service.ioLoop 123 : Transaction took: 0.500 seconds
Nov 29 04:02:02.242 host.example.tld service INFO  Loop(responseData -> oneliner) <result code="1000"> <contact:name>FOO�BAR LOG</contact:name>
Nov 29 04:02:02.242 host.example.tld service INFO  [18636C07-1:foobar:1.2.3.4] service.ioLoop 123 : Transaction took: 0.500 seconds

Sample logs - working:

Nov 29 04:02:02.242 host.example.tld service INFO  [18636C07-1:foobar:1.2.3.4] service.ioLoop 123 : Transaction took: 0.500 seconds
Nov 29 04:02:02.242 host.example.tld service INFO  Loop(responseData -> oneliner) <result code="1000"> <contact:name>FOOSBAR LOG</contact:name>
Nov 29 04:02:02.242 host.example.tld service INFO  [18636C07-1:foobar:1.2.3.4] service.ioLoop 123 : Transaction took: 0.500 seconds

Running with -one_shot, with replacement char, I get count_lines{prog="test"} 2and without it,count_lines{prog="test"} 3`

Is this a known problem and how can I get more info on why is it failing?

cc: @smrekarm12

matejzero avatar Dec 15 '22 11:12 matejzero

This is definitely a bug, thanks for finding it!

The problem is probably with how the file reader handles UTF-8 runes, and probably it's getting stuck by not discarding this after failing to convert the rune.

Can you find the INFO logs for mtail when this happens? https://google.github.io/mtail/Troubleshooting.html#deployment-problems has the location of the INFO log.

jaqx0r avatar Jan 01 '23 05:01 jaqx0r

I ran mtail with -logtostderr, -one_shot and -v=2 and the output is attached in log files.

Non-working log file: mtail_debug_nonworking.log

Working log file: mtail_debug_working.log

From the logs, I can only see the working one having one more decode.go:52] sendline log line and the non-working log is 2 bytes bigger, probably due to utf-8 rune.

matejzero avatar Jan 01 '23 13:01 matejzero

Any news regarding this issue? Just so we know if we should find another solution or wait?

matejzero avatar Mar 01 '23 09:03 matejzero

I can't reproduce this on the latest build:

jaq% ./mtail -progs eg.mtail -logs nonworking.log --one_shot --logtostderr --vmodule=decode=2 --one_shot_format=prometheus
I0527 13:16:08.084292 2760023 main.go:119] mtail version v3.0.1-58-ge7d279d7-dirty git revision e7d279d75b1b90b6f76e5e770d62e12ce651ecd1 go version go1.23-20240419-RC02 cl/626470163 +7f76c00fc5 X:fieldtrack,boringcrypto go arch amd64 go os linux
I0527 13:16:08.084423 2760023 main.go:120] Commandline: ["./mtail" "-progs" "eg.mtail" "-logs" "nonworking.log" "--one_shot" "--logtostderr" "--vmodule=decode=2" "--one_shot_format=prometheus"]
I0527 13:16:08.084985 2760023 store.go:189] Starting metric store expiry loop every 1h0m0s
I0527 13:16:08.085462 2760023 export.go:237] Export loop disabled.
I0527 13:16:08.087469 2760023 runtime.go:187] Loaded program eg.mtail
I0527 13:16:08.087758 2760023 logstream.go:74] Parsed url as /home/jaq/src/mtail/nonworking.log
I0527 13:16:08.087976 2760023 decode.go:64] sendline
I0527 13:16:08.088000 2760023 decode.go:64] sendline
I0527 13:16:08.087975 2760023 filestream.go:278] signalling stop at next EOF
I0527 13:16:08.088384 2760023 decode.go:64] sendline
I0527 13:16:08.088478 2760023 tail.go:295] Tailing /home/jaq/src/mtail/nonworking.log
I0527 13:16:08.088602 2760023 tail.go:316] pollPattern(/home/jaq/src/mtail/nonworking.log): no polling loop in oneshot mode
I0527 13:16:08.088636 2760023 tail.go:367] No gc loop in oneshot mode.
I0527 13:16:08.088672 2760023 mtail.go:141] Listening on [::]:3903
I0527 13:16:08.088905 2760023 runtime.go:295] END OF LINE
I0527 13:16:08.088930 2760023 runtime.go:296] processed 3 lines
I0527 13:16:08.089543 2760023 vm.go:1059] VM "eg.mtail" finished
I0527 13:16:08.089647 2760023 mtail.go:154] Shutdown requested.
# HELP count_lines defined at eg.mtail:2:9-19
# TYPE count_lines counter
count_lines{prog="eg.mtail"} 3
# HELP response_total defined at eg.mtail:1:9-22
# TYPE response_total counter
response_total{code="1000",prog="eg.mtail"} 1

I get the same results from both of the example files you posted at the start.

jaqx0r avatar May 27 '24 12:05 jaqx0r

Great info. I will pass that forward and reopen a ticket if it's not solved in the new version.

Thanks!

matejzero avatar May 28 '24 07:05 matejzero