tlsh
HAC-T clustering is very slow on larger data sets: a 500K TLSH list took ~6 hours
Hi, HAC-T clustering of a 500K TLSH list took 6 hours, but the paper reports ~2 hours 10 minutes for 10 million samples ("HAC-T and Fast Search for Similarity in Security": https://tlsh.org/papersDir/COINS_2020_camera_ready.pdf).
Could you explain how you achieved this faster clustering? Does it support multithreading?
My experiment:
Data: 500K TLSH input
Command: python hac-t.py -f
Thanks