something wrong on TAG parser

Open barrygu opened this issue 9 months ago • 2 comments

Incorrect tag parsing causes tag filtering to fail. Parsing every line correctly seems difficult, and some tag formats even conflict with each other, but the parser should be as correct as possible.

  1. The tag contains a space.

Image

Image

  2. It looks like the tag contains special characters such as "]}".

Image

Image

barrygu avatar Apr 15 '25 05:04 barrygu

I think the log tokenizing needs modification. Please attach the log for testing.

cdcsgit avatar Apr 15 '25 06:04 cdcsgit

I can't provide the complete file, so I selected some failing lines for your testing: test.log. I also found that many kernel log lines have a similar issue.

barrygu avatar Apr 15 '25 09:04 barrygu

Maybe this is related (sorry if it isn't). I've found an error with the default separator:

Image

When PID and TID are 0, classification fails. An AI assistant said that logcat output contains non-breaking spaces and suggested this separator: [\s\u00A0]+|:(?=\s). It works with the sample text in the log format settings (I copied some lines from the logs) but fails when used on the logs themselves.

The lines I've tested in Log format settings:

05-19 11:18:43.463  1057 15016 I QC2C2DEngine: calcYSize: unsupported 
05-19 11:18:43.466     0     0 I KERNEL  : [131628.418459] (CPU:2-pid:410:init)
05-19 11:37:50.226     0     0 I KERNEL  : [132775.178316] (CPU:1-pid:13568:wk:pmic_glink_) 
11-20 23:29:26.908  1376  3136 E Test4  : This line is sample 4

The second and third lines are not equal. According to the AI, they differ because the third contains non-breaking spaces (U+00A0, encoded as C2 A0 in UTF-8).
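
To make the comparison concrete, here is a minimal Python sketch (not LogNote's own code) that applies the separator quoted above to the second sample line and to a hypothetical copy whose spaces were replaced by U+00A0. `re.ASCII` is an assumption: it makes `\s` ignore U+00A0, as in regex engines where `\s` covers ASCII whitespace only. `split_fields` is an illustrative helper name.

```python
import re

# Separator quoted above. re.ASCII keeps \s from matching U+00A0 (assumption:
# this mirrors an engine whose \s covers ASCII whitespace only).
SEPARATOR = re.compile(r"[\s\u00A0]+|:(?=\s)", re.ASCII)


def split_fields(line):
    """Split on the separator and drop the empty fields it leaves behind."""
    return [field for field in SEPARATOR.split(line) if field]


plain = "05-19 11:18:43.466     0     0 I KERNEL  : [131628.418459] (CPU:2-pid:410:init)"
nbsp = plain.replace(" ", "\u00A0")  # hypothetical line with non-breaking spaces

print(plain == nbsp)        # False: the lines look alike but differ
print(split_fields(plain))
print(split_fields(nbsp))
```

In this sketch the two results differ by one field: in the U+00A0 copy the ":" after KERNEL survives as its own field, because the lookahead `(?=\s)` does not see U+00A0. Widening the lookahead to `(?=[\s\u00A0])` removes that difference here. Whether the same thing happens inside LogNote is not confirmed.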

Thanks for sharing your code.

kikecalpe avatar May 19 '25 11:05 kikecalpe

To maintain spacing when outputting logs, each " " character is converted to a "\u00A0" character. However, logcat output itself does not seem to contain a "\u00A0" character between entries. Does your actual logcat output contain the "\u00A0" character?
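
One way to answer that question is to look for U+00A0 directly in the captured text. The following Python sketch is only an illustration: `to_display` mimics the space conversion described above (it is not the project's actual function), and `nbsp_columns` is a hypothetical helper that reports where U+00A0 occurs in a line.

```python
NBSP = "\u00A0"


def to_display(line):
    """Mimic the conversion described above: spaces become non-breaking spaces."""
    return line.replace(" ", NBSP)


def nbsp_columns(line):
    """Return the 0-based columns at which U+00A0 occurs in a line."""
    return [col for col, ch in enumerate(line) if ch == NBSP]


raw = "05-19 11:37:50.226     0     0 I KERNEL  : [132775.178316]"
print(nbsp_columns(raw))                    # [] for a line with plain spaces
print(len(nbsp_columns(to_display(raw))))   # one entry per converted space
```

Running `nbsp_columns` over each line of a log file saved straight from logcat (opened with `encoding="utf-8"`) would show whether the raw output contains U+00A0, or whether it only appears in text copied from the viewer.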

"When 0 is provided for PID and TID, it fails doing classification" has been improved. You can check it in the latest source code.

There is an issue where parsing fails when there is no tag or the tag contains a space, and I am considering ways to improve this.
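
As one possible direction, and only as a sketch under assumptions, the tag can be matched lazily up to the first ": " instead of being treated as a single whitespace-delimited token. The Python example below assumes the line layout of the samples in this thread (date, time, PID, TID, level, tag, message). The pattern, the `parse` helper, and the last two test lines are hypothetical and not taken from the project.

```python
import re

# Assumed layout: "MM-DD HH:MM:SS.mmm PID TID LEVEL TAG: message".
# The tag is matched lazily up to the first ": ", so it may be empty
# or contain spaces.
LINE = re.compile(
    r"^(?P<date>\d\d-\d\d)\s+(?P<time>\d\d:\d\d:\d\d\.\d+)\s+"
    r"(?P<pid>\d+)\s+(?P<tid>\d+)\s+(?P<level>[A-Z])\s+"
    r"(?P<tag>.*?)\s*: (?P<message>.*)$"
)


def parse(line):
    """Return the named fields of one line, or None if the layout differs."""
    match = LINE.match(line)
    return match.groupdict() if match else None


samples = [
    "05-19 11:18:43.463  1057 15016 I QC2C2DEngine: calcYSize: unsupported",
    "11-20 23:29:26.908  1376  3136 E Test4  : This line is sample 4",
    "11-20 23:29:26.908  1376  3136 E My Tag: hypothetical line",  # tag with a space
    "11-20 23:29:26.908  1376  3136 E : hypothetical line",        # empty tag
]
for sample in samples:
    print(repr(parse(sample)["tag"]))
```

The trade-off is that a tag which itself contains ": " is cut at the first occurrence, so some ambiguous lines still cannot be parsed correctly.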

Thank you for your feedback.

cdcsgit avatar May 19 '25 14:05 cdcsgit

3f8227e

cdcsgit avatar Jul 13 '25 01:07 cdcsgit