Option to ignore a missing timestamp/level in a log, or to ignore unmatched lines
Is your feature request related to a problem? Please describe. I have some files that mix two or three log formats in one file (log messages, HTTP messages, and some others). Currently, when I try to match such a file, it does not seem to work, and it usually loads with an empty log format. Also, some lines have a timestamp/level and some do not, since it needs to be quickly visible that they do not belong together.
INFO:inout.optimio:Loading elements ....
http://1.1.1.1:2000/
1.1.1.1:33314 - - [05/Oct/2021 14:53:59] "HTTP/1.1 GET /json/api1" - 200 OK
WARNING:area: cannot use object for loading
ERROR:area2:Cannot find object in current space
1.1.1.1:33404 - - [05/Oct/2021 14:54:01] "HTTP/1.1 GET /" - 200 OK
1.1.1.1:23428 - - [05/Oct/2021 14:54:01] "HTTP/1.1 GET /static/bootstrap/css/bootstrap.css" - 200
1.1.1.1:23430 - - [05/Oct/2021 14:54:01] "HTTP/1.1 GET /static/bootstrap/css/override.css" - 200
Describe the solution you'd like Some option for loading a file where the user can specify that some fields may be missing (maybe even reuse the values from the last successful match?). Another option could be to ignore any line that does not match any regex. I believe this is what already happens when no format is found?
Describe alternatives you've considered I understand that these should be two files, or use the same format, but there are good reasons why it is this way. I have tried defining two different regex patterns, but it seems that when lines do not have a timestamp/level, it does not matter.
So I was trying to use a format and may have found where the problem is:
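To make the "take the last successful match" idea concrete, here is a minimal sketch in Python. It is not lnav code and does not reflect how lnav works internally; `TS_RE` and `with_inherited_timestamps` are hypothetical names made up for this illustration. Lines without a timestamp simply inherit the most recent one seen.

```python
import re
from datetime import datetime

# Hypothetical helper pattern: the bracketed timestamp used by the HTTP lines.
TS_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4} \d{2}:\d{2}:\d{2})\]")


def with_inherited_timestamps(lines):
    """Pair each line with its own timestamp, or with the last one seen.

    Lines that appear before any timestamped line get None.
    """
    last = None
    result = []
    for line in lines:
        match = TS_RE.search(line)
        if match:
            last = datetime.strptime(match.group(1), "%d/%b/%Y %H:%M:%S")
        result.append((last, line))
    return result


sample = [
    "INFO:inout.optimio:Loading elements ....",
    "http://1.1.1.1:2000/",
    '1.1.1.1:33314 - - [05/Oct/2021 14:53:59] "HTTP/1.1 GET /json/api1" - 200 OK',
    "WARNING:area: cannot use object for loading",
]

for ts, line in with_inherited_timestamps(sample):
    print(ts, "|", line)
```

The first two lines have no earlier timestamp to inherit, which is the open question with this approach: they would need either no timestamp at all or some fallback such as the file's modification time.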
^(?<c_ip>[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3})\\:(?<c_port>[0-9]{3,6}) \\-\\ \\-\\ \\[(?<timestamp>\\d\\d\\/[a-zA-Z]{2,5}\\/\\d\\d\\d\\d\\ \\d\\d\\:\\d\\d:\\d\\d)\\]\\ \"HTTP\\/1\\.1\\ (?<method>GET|POST|OPTIONS|PUT|HEAD)\\ (?<url>[a-zA-Z0-9\\.\\/\\_].+)\"\\ \\-\\ (?<code>\\d\\d\\d)?\\ ?(?<status>[a-zA-Z\\ ]{0,20})$"
With it I get a match like this one, where it somehow gets confused:
│1.1.1.1:59928 - - [06/Oct/2021 12:09:33] "HTTP/1.1 GET /settings" - 200 OK │
Received Time: 2021-10-06T12:09:33.000 -- 4 days ago │
Pattern: oc_log/regex/http_req = ^(?<c_ip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\:(?<c_port>[0-9]{3,6}) \-\ \-\ \[(?<timestamp>\d\d\/[a-zA-Z]{2,5}\/\d\d\d\d\ \d\d\:\d\d:\d\d)\]\ "HTTP\/1\.1\ (│
No known message fields │
Discovered fields for logline table from message format: #:#INFO:#__main__:#INFO:#__main__:# │
├ col_0 = 1.1.1.1 │
├ col_1 = 59928 - - [06/Oct/2021 12:09:33] "HTTP/1.1 GET /settings" - 200 OK │
├ INFO = │
├ __main__ = Parameters are same for VALUE2 │
├ INFO_0 = │
└ __main___0 = Parameters are same for VALUE3 │
│INFO:__main__:Parameters are same for VALUE2
The line above it seems to be parsed correctly:
1.1.1.1:59848 - - [06/Oct/2021 12:09:30] "HTTP/1.1 GET /path2" - 200 │
Received Time: 2021-10-06T12:09:30.000 -- 4 days ago │
Pattern: oc_log/regex/http_req = ^(?<c_ip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\:(?<c_port>[0-9]{3,6}) \-\ \-\ \[(?<timestamp>\d\d\/[a-zA-Z]{2,5}\/\d\d\d\d\ \d\d\:\d\d:\d\d)\]\ "HTTP\/1\.1\ (│
Known message fields for table oc_log: │
├ c_ip = 1.1.1.1 │
├ timestamp = 06/Oct/2021 12:09:30 │
├ method = GET │
├ url = /path2 │
├ code = 200 │
├ status =
Log file to test:
1.1.1.1:59848 - - [06/Oct/2021 12:09:30] "HTTP/1.1 GET /path2" - 200
1.1.1.1:59928 - - [06/Oct/2021 12:09:33] "HTTP/1.1 GET /settings" - 200 OK
INFO:__main__:Parameters are same for VALUE2
INFO:__main__:Parameters are same for VALUE3
1.1.1.1:59950 - - [06/Oct/2021 12:09:33] "HTTP/1.1 GET /path3" - 303 See Other
INFO:__main__:Parameters are same for DONT_ITER
INFO:webapp.web_db:Trying to find another element ITEM3
Is it possible that even this does not work?
Can you attach the full format file?
This one already has some changes
{
    "oc_log": {
        "title": "Python log format",
        "description": "Log format used by python logging",
        "url": "",
        "regex": {
            "main_log": {
                "pattern": "^(?<level>\\w+)\\:(?<module>[a-zA-Z0-9\\.\\_]+)\\:(?<body>.*)?(?<timestamp>)"
            },
            "http_req": {
                "pattern": "^(?<c_ip>[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3})\\:(?<c_port>[0-9]{3,6}) \\-\\ \\-\\ \\[(?<timestamp>\\d\\d\\/[a-zA-Z]{2,5}\\/\\d\\d\\d\\d\\ \\d\\d\\:\\d\\d:\\d\\d)\\]\\ \"HTTP\\/1\\.1\\ (?<method>GET|POST|OPTIONS|PUT|HEAD)\\ (?<url>[a-zA-Z0-9\\.\\/\\_].+)\"\\ \\-\\ (?<code>\\d\\d\\d)?\\ ?(?<status>[a-zA-Z\\ ]{0,20})\\n?$"
            }
        },
        "timestamp-format": ["%d/%b/%Y %H:%M:%S"],
        "level": {
            "error": "ERROR",
            "warning": "WARNING",
            "debug": "DEBUG",
            "info": "INFO"
        },
        "opid-field": "c_ip",
        "value": {
            "c_ip": {
                "kind": "string",
                "collate": "ipaddress",
                "identifier": true,
                "description": "The client IP address"
            },
            "timestamp": {
                "kind": "string"
            },
            "level": {
                "kind": "string"
            },
            "method": {
                "kind": "string"
            },
            "url": {
                "kind": "string"
            },
            "code": {
                "kind": "string"
            },
            "status": {
                "kind": "string"
            },
            "module": {
                "kind": "string"
            },
            "body": {
                "kind": "string"
            }
        },
        "sample": [
            {"line": "1.1.1.1:53906 - - [23/May/2021 14:08:43] \"HTTP/1.1 GET /products/planview.js\" - 200 OK"},
            {"line": "1.1.1.1:53898 - - [23/May/2021 14:08:43] \"HTTP/1.1 GET /static/bootstrap/css/bootstrap.css\" - 200 "},
            {"line": "1.1.1.1:33092 - - [23/May/2021 14:11:33] \"HTTP/1.1 GET /json/usersettings/PRODFUNC\" - 200 OK "}
        ]
    }
}
Try adding "multiline": false to your format. Let me know if that's the behavior you're looking for.
Hi, thanks for the suggestion. It does not seem to change the behavior. Some lines, usually those just before a line in the other format, are still not recognized. It just says that the pattern is matched, but shows "No known message fields" and then the next line.
Pattern: oc_log/regex/http_req = ^(?<c_ip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\:(?<c_port>[0-9│
No known message fields │
Discovered fields for logline table from message format:
So I looked at this multiline setting again, and with -C there is a difference:
lnav -C /tmp/log2
error:/tmp/log2:2:line did not match format oc_log/regex/http_req2
error:/tmp/log2:2: line -- INFO:__main__:Parameters are same for VALUE2
error:/tmp/log2:2:no partial match found
error:/tmp/log2:3:line did not match format oc_log/regex/http_req2
error:/tmp/log2:3: line -- INFO:__main__:Parameters are same for VALUE3
error:/tmp/log2:3:no partial match found
error:/tmp/log2:5:line did not match format oc_log/regex/http_req2
error:/tmp/log2:5: line -- INFO:__main__:Parameters are same for DONT_ITER
error:/tmp/log2:5:no partial match found
error:/tmp/log2:6:line did not match format oc_log/regex/http_req2
error:/tmp/log2:6: line -- INFO:webapp.web_db:Trying to find another element ITEM3
error:/tmp/log2:6:no partial match found
error:/tmp/log2:9:line did not match format oc_log/regex/http_req2
error:/tmp/log2:9: line -- INFO:webapp.web_db:Trying to find another element ITEM3
error:/tmp/log2:9:no partial match found
But now it reports outright that those lines do not match. So I guess this is the real reason (that I should not have two incompatible formats)?
Update: with the latest version, 0.10, -C freezes. But without -C it seems to work somehow: unrecognized lines are now shown as logline, and all the others are recognized. So this seems to work better. If it could also try another regex, that would be best ;)
The -C option is to help with checking a format to make sure it is correct. In this case, you have log messages without timestamps that you want ignored, which is what's happening with "multiline": false. Note that when I open it in the normal view, the lines without timestamps are highlighted and they show an "Invalid log message" indicator.

You can also do a SELECT * FROM oc_log to see that the timestamped log messages are interpreted correctly and the invalid log messages have a log_level of invalid.
Are you not seeing that?
Note that when multiline is set to true, the lines after a log message with a timestamp are considered to be part of that one log message. Due to a quirk in how lnav works, the log lines are initially matched against patterns individually and then the pattern is applied against all lines for a given message. In your case, the pattern matches a single line just fine. But, it does not match the multi-line log message, which gives the weird results.
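The effect described above can be reproduced with a toy model in Python. This is only a sketch of the described behavior, not lnav's implementation: `group_messages` is a hypothetical function, and `HTTP_RE` is a simplified Python translation of the `http_req` pattern (named groups use `(?P<...>)`, and the URL group is reduced to `[^"]+`).

```python
import re

# Simplified Python translation of the http_req pattern (an assumption,
# not the exact pattern from the format file).
HTTP_RE = re.compile(
    r'^(?P<c_ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<c_port>\d{3,6}) - - '
    r'\[(?P<timestamp>\d\d/[A-Za-z]{2,5}/\d{4} \d\d:\d\d:\d\d)\] '
    r'"HTTP/1\.1 (?P<method>GET|POST|OPTIONS|PUT|HEAD) (?P<url>[^"]+)" - '
    r'(?P<code>\d{3})? ?(?P<status>[A-Za-z ]{0,20})$'
)

LOG = [
    '1.1.1.1:59848 - - [06/Oct/2021 12:09:30] "HTTP/1.1 GET /path2" - 200',
    '1.1.1.1:59928 - - [06/Oct/2021 12:09:33] "HTTP/1.1 GET /settings" - 200 OK',
    "INFO:__main__:Parameters are same for VALUE2",
    "INFO:__main__:Parameters are same for VALUE3",
]


def group_messages(lines, pattern):
    """Start a new message at each matching line; append the rest to it."""
    messages = []
    for line in lines:
        if pattern.match(line) or not messages:
            messages.append([line])
        else:
            messages[-1].append(line)
    return ["\n".join(parts) for parts in messages]


for message in group_messages(LOG, HTTP_RE):
    matched = HTTP_RE.match(message) is not None
    print(matched, repr(message))
```

In this model the second HTTP line matches on its own, but once the two `INFO` lines are folded into it, the end-anchored pattern no longer matches the combined message, so no fields are extracted from it.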
So, I'm not sure what to do with this bug. I think the handling of cases where the pattern does not match a multi-line message needs to be improved. But, otherwise, I'm not sure what else you'd like to see here.
Well, for me -C with version 0.10 (binary compiled from GitHub) takes a long time (the GUI does not show up even after 1 minute, at 100% CPU).
Anyway, what I meant was that the other log format is not used, right? Could this be relaxed somehow, so that you would specify the formats (1 or 2) and it would use them? I mean, maybe if a whole format is used, it would use just the specified regexes? But I guess the problem is the SQLite table, since SELECT * FROM oc_log covers the whole format and does not distinguish http_req from main_log. So, if I understand correctly, I cannot use two completely different regexes, right? Would it be possible with "multiline": false?
1) lnav -i format_http.json
2) lnav /tmp/logfile
3) rm (lnav dir)format_http.json
4) lnav -i format_message.json
5) lnav /tmp/logfile
This way I could load data from one file via both formats? Or would I still have the problem with the missing timestamp/level?
And yes, in 0.10 I am now seeing exactly that (some valid and some invalid ;)). In 0.9 it did not work. By the way, in another issue I found a bug with ncurses/input/GUI in 0.10.
OK, so now in the 0.10 beta, with the log format from above and a 20 MB log file, I get a lot of freezing. Sometimes the loading indicator shows up in green at 13%-14%, and then I have to kill it.
Also, at startup I had to press Esc several times before I was able to move. After a while I was not able to move (up/down) any more and had to kill it. 0.9 was okay with that file, but it probably did not parse it at all.
Is it possible that these two kinds of log lines will just not work together at all? If so, could you point me to the relevant source files, so I can try to look into what could be wrong?
For example, I went to the end of the file and it was parsing okay, and after a while only invalid lines were in view. After a while I was not able to move up (the file still continues); it stopped 50 lines up with 100% CPU, and waiting 1 minute did not help. So something tells me that my regex is wrong, or that I cannot have files with these two kinds of log lines inside.
Here is the log from moving up until it hit 100% CPU. I am not sure whether it is actually in the log. I just pressed P and moved up with the arrow keys. lnav_debug.zip
Hi, I have tried the current release again and it certainly works better (still some crashes, but far fewer). But the problem with multiple formats in one log file seems to be the same. Is there a way to force lnav to use several log formats, so that each line is parsed with whichever format matches it, and the next line can use a different format?
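To illustrate the behavior I am asking for, here is a small Python sketch that tries each pattern per line and uses the first one that matches. It is not lnav code; `PATTERNS` and `classify` are hypothetical names, and the regexes are Python translations of the two patterns from my format file, with the URL group simplified to `[^"]+`.

```python
import re

# Python translations of the two patterns (assumptions for illustration only).
PATTERNS = {
    "http_req": re.compile(
        r'^(?P<c_ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<c_port>\d{3,6}) - - '
        r'\[(?P<timestamp>\d\d/[A-Za-z]{2,5}/\d{4} \d\d:\d\d:\d\d)\] '
        r'"HTTP/1\.1 (?P<method>GET|POST|OPTIONS|PUT|HEAD) (?P<url>[^"]+)" - '
        r'(?P<code>\d{3})? ?(?P<status>[A-Za-z ]{0,20})$'
    ),
    "main_log": re.compile(
        r"^(?P<level>\w+):(?P<module>[A-Za-z0-9._]+):(?P<body>.*)$"
    ),
}


def classify(line):
    """Return (pattern name, captured fields) for the first matching pattern."""
    for name, pattern in PATTERNS.items():
        match = pattern.match(line)
        if match:
            return name, match.groupdict()
    return None, {}


lines = [
    '1.1.1.1:59950 - - [06/Oct/2021 12:09:33] "HTTP/1.1 GET /path3" - 303 See Other',
    "INFO:__main__:Parameters are same for DONT_ITER",
    "INFO:webapp.web_db:Trying to find another element ITEM3",
    "http://1.1.1.1:2000/",
]

for line in lines:
    print(classify(line))
```

Each line ends up with the fields of whichever pattern matched it, and a line that matches neither pattern is reported with no format instead of being merged into the previous message.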