LogPPT
ValueError: Sample larger than population or is negative
Dear @vanhoanglepsa
When I run python3 fewshot_sampling.py, it processes the HDFS log but fails with an error on the Hadoop log.
HDFS
{'log_file': 'HDFS/HDFS_2k.log', 'log_format': '<Date> <Time> <Pid> <Level> <Component>: <Content>'}
Hadoop
{'log_file': 'Hadoop/Hadoop_2k.log', 'log_format': '<SessionId> <Date> <Time> <Level> \\[<Process>\\] <Component>: <Content>'}
Traceback (most recent call last):
  File "fewshot_sampling.py", line 48, in <module>
    samples_ids = adaptive_random_sampling(shuffle(content), shot)
  File "/media/cvpr/CM_1/LogPPT/logppt/sampling/__init__.py", line 56, in adaptive_random_sampling
    candidate_set = [(x, logs[x]) for x in range(len(logs)) if x in random.sample(range(len(logs)), n_candidate)]
  File "/media/cvpr/CM_1/LogPPT/logppt/sampling/__init__.py", line 56, in <listcomp>
    candidate_set = [(x, logs[x]) for x in range(len(logs)) if x in random.sample(range(len(logs)), n_candidate)]
  File "/home/cvpr/anaconda3/envs/timm_tutorials/lib/python3.7/random.py", line 321, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative
Hi @khawar-islam
The error was caused by some modifications we made for other experimental settings. We've pushed an updated version that fixes it. Please find the new sampling code here and update to the latest version.
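For anyone hitting the same traceback on an older checkout: random.sample raises this ValueError whenever the requested sample size is larger than the population (or negative), which is what happens when n_candidate exceeds len(logs). The following is only a minimal sketch of a guard, not the repository's actual fix. The helper name sample_candidates is hypothetical, and it draws the indices once instead of re-sampling for every x as the line in the traceback does, so the resulting candidate set is not identical to the original behaviour.

```python
import random


def sample_candidates(logs, n_candidate):
    """Pick up to n_candidate (index, log) pairs without replacement.

    Hypothetical helper for illustration; not part of LogPPT.
    """
    # random.sample raises ValueError when k > len(population) or k < 0,
    # so clamp k into the valid range first.
    k = max(0, min(n_candidate, len(logs)))
    # Draw the indices once and keep them in their original order.
    chosen = sorted(random.sample(range(len(logs)), k))
    return [(i, logs[i]) for i in chosen]
```

With this guard, asking for more candidates than there are log lines simply returns all of them instead of raising.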
@vanhoanglepsa Hello, and thank you for your great work. Could you please tell me how I can run fewshot_sampling.py on my own raw log file?
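Not an official answer, but judging only from the settings printed earlier in this thread, each dataset seems to be described by a log_file path and a log_format string, and the sampling runs on the <Content> field. Assuming that holds, a raw log would first need a matching log_format so that the content can be separated from the header fields. Below is a rough sketch of that step. The helper names format_to_regex and extract_content are hypothetical and are not taken from the LogPPT code.

```python
import re


def format_to_regex(log_format):
    """Turn a '<Field> ...' format string into (field_names, compiled_regex).

    Hypothetical helper for illustration; not part of LogPPT.
    """
    headers = []
    pattern = ""
    # Splitting on a capturing group keeps the <Field> placeholders:
    # even positions are literal text, odd positions are placeholders.
    for i, part in enumerate(re.split(r"(<[^<>]+>)", log_format)):
        if i % 2 == 0:
            # Literal text is used as a regex fragment; runs of spaces
            # become flexible whitespace.
            pattern += re.sub(r" +", r"\\s+", part)
        else:
            name = part.strip("<>")
            headers.append(name)
            pattern += f"(?P<{name}>.*?)"
    return headers, re.compile("^" + pattern + "$")


def extract_content(lines, log_format):
    """Return the Content field of every line that matches log_format."""
    _, regex = format_to_regex(log_format)
    contents = []
    for line in lines:
        match = regex.match(line.strip())
        if match:  # lines that do not fit the format are skipped
            contents.append(match.group("Content"))
    return contents
```

For example, extract_content(open("my.log"), "<Date> <Time> <Level> <Component>: <Content>") would give the list of message bodies, where my.log and the format string are placeholders for your own file and layout. How that list is then passed to the sampling script depends on the repository's current code.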