Running on raw log data

Open mykolav95 opened this issue 2 years ago • 11 comments

Is the proposed solution intended to work on raw log data?

mykolav95 avatar Jun 04 '23 15:06 mykolav95

@mykolav95 @vanhoanglepsa

How can we test on raw logs and produce meaningful output, rather than just the reported accuracy?

khawar-islam avatar Jun 21 '23 02:06 khawar-islam

@khawar-islam Hi. Thanks for answering. Yes, exactly. I've had a quick look through the code. It seems that the input data must at least be labeled with timestamp, severity, and content. Is that so?

mykolav95 avatar Jun 22 '23 13:06 mykolav95

Dear @mykolav95

How can we generate labeled data that contains timestamp, severity, and content? We have raw log data, so how can we process it?

khawar-islam avatar Jun 23 '23 00:06 khawar-islam

@khawar-islam I might be absolutely wrong then. Can you help me run the model on raw data?

mykolav95 avatar Jun 26 '23 08:06 mykolav95

Hi @khawar-islam, @mykolav95. LogPPT can work on both raw log data and semi-structured log data (with log headers). Please refer to this to define the header for your log data. If your log data doesn't contain any header, just set log_format = "<Content>".
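To illustrate what a header definition does, here is a minimal sketch of how a format string can be turned into a regular expression that splits each raw line into fields. It is an assumption about the general approach, not LogPPT's own code: the helper names `format_to_regex` and `parse_line` and the sample log line are made up for this example.

```python
import re

def format_to_regex(log_format):
    """Turn a format string such as '<Date> <Time>: <Content>' into (field names, compiled regex)."""
    headers = []
    pattern = ""
    # re.split with a capturing group keeps the <Field> placeholders in the result.
    for part in re.split(r"(<[^<>]+>)", log_format):
        if part.startswith("<") and part.endswith(">"):
            name = part[1:-1]
            headers.append(name)
            pattern += f"(?P<{name}>.*?)"
        else:
            # Literal separator text: escape it, but let any whitespace run match \s+.
            pattern += r"\s+".join(re.escape(p) for p in re.split(r"\s+", part))
    return headers, re.compile("^" + pattern + "$")

def parse_line(line, log_format):
    """Return a dict of header fields for one log line, or None if the line does not match."""
    headers, regex = format_to_regex(log_format)
    match = regex.match(line.strip())
    if match is None:
        return None
    return {h: match.group(h) for h in headers}

# Made-up line with a header, and the same call with no header at all.
with_header = parse_line(
    "081109 203615 INFO dfs.DataNode: Receiving block blk_1",
    "<Date> <Time> <Level> <Component>: <Content>",
)
no_header = parse_line("Receiving block blk_1", "<Content>")
```

With a header, only the `Content` field is what gets parsed into a template; with `"<Content>"` the whole line is treated as content.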

vanhoanglepsa avatar Jul 07 '23 06:07 vanhoanglepsa

Hi @vanhoanglepsa, can you describe how to use it? For example, which code to keep and which to remove?

sxj19980519 avatar Sep 16 '23 10:09 sxj19980519

@sxj19980519 Could you provide some examples of your data? Basically, you will need to define the header fields and some pairs of (log content, log template) to run LogPPT.
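As a rough sketch of what a few hand-labeled (log content, log template) pairs might look like on disk, the snippet below writes them to CSV. The column names `LineId`, `Content`, and `EventTemplate` and the `<*>` wildcard follow the convention of the public loghub structured files. Whether LogPPT expects exactly these columns is an assumption, and the example lines are made up.

```python
import csv
import io

# Hypothetical hand-labeled examples: (log content, log template).
labelled = [
    ("Receiving block blk_1 src: /10.0.0.1:5000", "Receiving block <*> src: <*>"),
    ("Deleting block blk_2 file /data/blk_2", "Deleting block <*> file <*>"),
]

def write_pairs(pairs, handle):
    """Write (content, template) pairs as CSV rows with a running line id."""
    writer = csv.DictWriter(handle, fieldnames=["LineId", "Content", "EventTemplate"])
    writer.writeheader()
    for line_id, (content, template) in enumerate(pairs, start=1):
        writer.writerow({"LineId": line_id, "Content": content, "EventTemplate": template})

# An in-memory buffer keeps the sketch self-contained; a real run would open a .csv file.
buffer = io.StringIO()
write_pairs(labelled, buffer)
```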

vanhoanglepsa avatar Sep 26 '23 04:09 vanhoanglepsa

Thank you for your reply. I have already solved this problem. However, when I switch the model from RoBERTa to BERT, the terminal reports “RuntimeError: The size of tensor a (1576) must match the size of tensor b (512) at non-singleton dimension 1”. I found that some especially long logs cause this problem, for example the entry with index 1580 in the dataset (HDFS_2k.log_structured).
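The 512 in the error matches the maximum sequence length of the standard BERT checkpoints, so one possible workaround is to clip over-long logs before they reach the model. The sketch below shows the idea in plain Python. It is not taken from LogPPT; `truncate_ids` is a hypothetical helper, and the ids 101 and 102 are the [CLS] and [SEP] ids of bert-base-uncased, assumed here for illustration.

```python
def truncate_ids(token_ids, max_len=512, cls_id=101, sep_id=102):
    """Clip a token-id sequence so that [CLS] + ids + [SEP] fits within max_len."""
    # Reserve two positions for the special tokens added around the sequence.
    body = token_ids[: max_len - 2]
    return [cls_id] + body + [sep_id]

# A made-up 1576-token log, the length reported in the error above.
long_log = list(range(1000, 2576))
clipped = truncate_ids(long_log)
```

With Hugging Face tokenizers, the same effect is usually achieved by passing `truncation=True` and `max_length=512` when calling the tokenizer. Note that truncation drops the tail of the log, so any parameters there will not appear in the parsed template.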

sxj19980519 avatar Sep 27 '23 08:09 sxj19980519

@sxj19980519 Would you please share how you solved the problem? I am trying to run it on my own raw log file, but it does not work because I do not have labeled data.

Tasneem91 avatar Oct 30 '23 12:10 Tasneem91

Were you able to resolve the problem? I have raw data, but running the few-shot sampling requires data that is already structured or parsed. I don't see the point of using another log parsing method if LogPPT is essentially made for log parsing.

rustamtemirov avatar Nov 28 '23 14:11 rustamtemirov

@vanhoanglepsa Hi Van, could you explain how to create the structured and structured_corrected CSV files?

rustamtemirov avatar Nov 28 '23 14:11 rustamtemirov