LogADEmpirical

Missing HDFS.log_structured.csv file

Open alishan2040 opened this issue 3 years ago • 2 comments

Hello, I was trying to run DeepLog on the HDFS dataset but ended up with the following error.

[image: screenshot of the error]

Command I used: !python main_run.py --folder=bgl/ --log_file=HDFS.log --dataset_name=hdfs --model_name=deeplog --window_type=sliding --sample=sliding_window --is_logkey --train_size=0.8 --train_ratio=1 --valid_ratio=0.1 --test_ratio=1 --max_epoch=100 --n_warm_up_epoch=0 --n_epochs_stop=10 --batch_size=1024 --num_candidates=150 --history_size=10 --lr=0.001 --accumulation_step=5 --session_level=hour --window_size=60 --step_size=60 --output_dir=experimental_results/demo/random/ --is_process

Are we supposed to run other scripts first to generate such files (for example, data_loader.py or synthesize.py)? Can we re-run the code with other publicly available formats of the HDFS dataset? Thanks,

alishan2040 avatar Jul 28 '22 21:07 alishan2040

Did you solve this problem? How can we get this structured.csv?

X-zhihao avatar Aug 13 '22 10:08 X-zhihao

You can use logparser (available on GitHub) to preprocess the HDFS dataset; it can generate HDFS.log_structured.csv.

wangwenjing1999 avatar Aug 26 '22 12:08 wangwenjing1999
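
For anyone landing here later, a minimal sketch of the logparser step suggested above. It assumes the logpai logparser package is installed and uses its Drain parser. The log format, regexes, `depth`, and `st` values follow logparser's HDFS demo settings as I remember them and are not confirmed by this repo. The helper names `structured_csv_path` and `parse_hdfs` are hypothetical.

```python
import os

# Assumed settings, modelled on logparser's HDFS demo; adjust if your log differs.
HDFS_FORMAT = "<Date> <Time> <Pid> <Level> <Component>: <Content>"
HDFS_REGEX = [
    r"blk_-?\d+",               # block IDs
    r"(\d+\.){3}\d+(:\d+)?",    # IP addresses with optional port
]


def structured_csv_path(out_dir, log_file):
    """Path of the CSV that logparser writes: <log_file>_structured.csv."""
    return os.path.join(out_dir, log_file + "_structured.csv")


def parse_hdfs(in_dir, out_dir, log_file="HDFS.log"):
    """Run Drain on the raw log and return the path of the structured CSV."""
    # Third-party import kept local so the helper above works without logparser.
    from logparser.Drain import LogParser

    parser = LogParser(
        HDFS_FORMAT,
        indir=in_dir,
        outdir=out_dir,
        depth=4,   # depth of the parse tree (assumed demo value)
        st=0.5,    # similarity threshold (assumed demo value)
        rex=HDFS_REGEX,
    )
    parser.parse(log_file)
    return structured_csv_path(out_dir, log_file)
```

Calling `parse_hdfs("dataset/hdfs/", "dataset/hdfs/")` (example paths only) should leave HDFS.log_structured.csv next to the raw log. Where main_run.py expects the file is not stated in this thread; the directory passed via --folder is a reasonable first guess.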