Issues with Implementation and Replicating Paper Results
Hi there! I am working on a PyTorch implementation of SLIDE and am currently trying to compare its performance against the original SLIDE implementation. I have run into a few doubts/issues while evaluating SLIDE and would appreciate clarifications on the following.
- I'm unable to replicate the accuracy vs. iteration plot for the Delicious 200K dataset using the parameters mentioned in the paper (SimHash, K=9, L=50; plot attached). I also observe that SLIDE's accuracy seems to worsen beyond a certain point. What could be the reasons for these?
- I observe a few inconsistencies in the implementations of the WTA and DWTA hashes:
  - In https://github.com/keroro824/HashingDeepLearning/blob/3cebe6f99a5454bef6f241dea804e07e0d075484/SLIDE/LSH.cpp#L82 the hashes are combined as `index += h << ((_K-1-j) * (int)floor(log(binsize)));`. But if the hashes are simply meant to be concatenated, shouldn't it instead be `index += h << ((_K-1-j) * (int)ceil(log2(binsize)));`? However, for binsize = 8, I also observe that shifting by floor(log(binsize)) = 2 bits gives better convergence than shifting by ceil(log2(binsize)) = 3 bits. Is this intentional? Why is this the case? (A small sketch contrasting the two shift widths follows at the end of this list.)
  - There appears to be a bug in the WTA hash: https://github.com/keroro824/HashingDeepLearning/blob/3cebe6f99a5454bef6f241dea804e07e0d075484/SLIDE/WtaHash.cpp#L57 (a reference sketch of the standard WTA hash is also included below).
- What is the reason behind using SimHash for Delicious 200K and DWTA hash for Amazon 670K?
- The paper mentioned extending SLIDE to convolutional layers as a future direction. Has there been any progress along this line?
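For concreteness on the shift-width question above, here is a minimal sketch of how K per-table hash values combine into a bucket index under the two widths. This is my own illustration, not the repo's code; `combine` and the example values are assumed names/inputs.

```cpp
#include <cmath>
#include <cstdio>

// Sketch: combine K hash values h_j, each in [0, binsize), into one bucket
// index by bit-shifting. With shift = ceil(log2(binsize)) each h_j occupies
// its own bit field, i.e. a true concatenation. With shift =
// floor(log(binsize)) (natural log, as in LSH.cpp), binsize = 8 yields a
// 2-bit shift, so the 3-bit hash values overlap and partially add.
int combine(const int* hashes, int K, int binsize, bool concatenate) {
    int shift = concatenate ? (int)ceil(log2((double)binsize))
                            : (int)floor(log((double)binsize));  // natural log
    int index = 0;
    for (int j = 0; j < K; j++) {
        index += hashes[j] << ((K - 1 - j) * shift);
    }
    return index;
}

int main() {
    int hashes[3] = {7, 5, 3};                     // example 3-bit hash values, binsize = 8
    printf("%d\n", combine(hashes, 3, 8, true));   // 491: disjoint 3-bit fields
    printf("%d\n", combine(hashes, 3, 8, false));  // 135: fields collide at bit 4
    return 0;
}
```

Note that the aliasing in the 2-bit case also shrinks the bucket index range (roughly 2^(2K+1) vs. 2^(3K) distinct values), which may interact with the table size.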
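And for reference on the WTA point, here is a minimal sketch of what a WTA hash is usually defined to compute (Yagnik et al.: permute the input, then output the argmax position within a fixed-size prefix window), i.e. the behavior I would expect around the linked line. All names and parameters here are illustrative, not SLIDE's:

```cpp
#include <vector>
#include <random>
#include <numeric>
#include <algorithm>
#include <cstdio>

// Reference sketch of a WTA hash: for each of m random permutations, look at
// the first comparison_window entries of the permuted input and output the
// position of the maximum within that window.
std::vector<int> wtaHash(const std::vector<float>& x, int m,
                         int comparison_window, unsigned seed) {
    std::mt19937 rng(seed);
    std::vector<int> perm(x.size());
    std::iota(perm.begin(), perm.end(), 0);
    std::vector<int> codes(m);
    for (int i = 0; i < m; i++) {
        std::shuffle(perm.begin(), perm.end(), rng);  // fresh permutation per hash
        int argmax = 0;
        for (int j = 1; j < comparison_window; j++) {
            if (x[perm[j]] > x[perm[argmax]]) argmax = j;
        }
        codes[i] = argmax;  // index of the winner within the window
    }
    return codes;
}

int main() {
    std::vector<float> x = {0.3f, 1.2f, -0.5f, 0.9f, 2.1f, 0.0f};
    for (int c : wtaHash(x, 4, 3, 42)) printf("%d ", c);
    printf("\n");
    return 0;
}
```

(As I understand it, DWTA densifies this scheme for sparse inputs by letting empty windows borrow a winner from non-empty ones, so I would expect the WTA and DWTA implementations to agree on dense data.)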
@its-sandy Dear Mr. its-sandy,
May I ask why you are working on a PyTorch implementation, and whether you were able to reproduce the results from the paper? I have been trying to run the code for a long time, and it always exits with "Killed".
Also, have you found the weight and savedweight files? I can't find them, and I really need to run the code.