Dmitry Kobak
Absolutely agree with @frankier that this would be a great additional feature. I am working on manifold learning / clustering, and what I am interested in is exactly the kNN...
I can confirm that PyNNDescent builds an actual kNN graph during build time. If you build it with `k` neighbors and later only want to have `k` (or fewer) neighbors,...
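To illustrate the general point (with a brute-force numpy sketch, not PyNNDescent's actual internals): if a kNN graph stores each point's neighbors sorted by distance, then the graph built with `k` neighbors contains every `k' <= k` graph as its first `k'` columns, so querying fewer neighbors is just a slice.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

def knn_graph(X, k):
    # Brute-force kNN graph: for each point, the indices of its k
    # nearest neighbors (self excluded), sorted by distance.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)      # exclude self-neighbors
    order = np.argsort(d, axis=1)    # columns sorted by distance
    return order[:, :k]

G10 = knn_graph(X, 10)
G5 = knn_graph(X, 5)

# The 5-NN graph is exactly the first 5 columns of the 10-NN graph.
assert np.array_equal(G10[:, :5], G5)
```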
Thanks Leland, that's very helpful. Do you know if any other algorithms currently included in `ann-benchmarks` construct a kNN graph when building the index?
Thanks again. All good points.
@fsvbach Can you maybe post the exact error you received before? It should point to the specific line that triggered the error.
> I expect TSNE to return 4 distinct clusters (actually 4 points only). Sklearn yields this. Actually the points should not be on top of each other in the embedding....
Wow. None of that makes any sense to me! Unless I am missing something, the points should not overlap at all, so we should see 200 points separately. BH-openTSNE makes...
It seems that this has nothing to do with `precomputed` specifically, but with how we treat identical points.
```
np.random.seed(42)
X = np.concatenate((
    np.array(list(np.random.randn(1,10))*50),
    np.array(list(np.random.randn(1,10))*50),
    np.array(list(np.random.randn(1,10))*50),
    np.array(list(np.random.randn(1,10))*50)
))
Z = TSNE(negative_gradient_method='bh').fit(X)
```
...
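For reference, the construction above yields 200 rows but only 4 distinct points, each repeated 50 times. A quick sanity check in pure numpy (independent of any t-SNE implementation):

```python
import numpy as np

np.random.seed(42)
# Same construction: 4 distinct random points, each repeated 50 times.
X = np.concatenate([
    np.array(list(np.random.randn(1, 10)) * 50)
    for _ in range(4)
])

# 200 rows in total, but only 4 unique points.
unique_rows = np.unique(X, axis=0)
print(X.shape, unique_rows.shape)  # (200, 10) (4, 10)
```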
I realized that I was wrong when I said that the optimal embedding should not have overlapping points. On reflection, it should be possible to reduce KL to zero by...
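As a reminder of the underlying fact (not specific to any implementation): $\mathrm{KL}(P\|Q) = \sum_i p_i \log(p_i/q_i)$ attains its minimum of zero exactly when $Q = P$, and is never negative (Gibbs' inequality). A minimal numpy illustration:

```python
import numpy as np

def kl(p, q):
    # KL divergence between two discrete probability distributions.
    return np.sum(p * np.log(p / q))

p = np.array([0.1, 0.2, 0.3, 0.4])
q = np.array([0.25, 0.25, 0.25, 0.25])

# KL is zero iff the two distributions coincide...
assert np.isclose(kl(p, p), 0.0)

# ...and strictly positive otherwise.
assert kl(p, q) > 0
```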
Update: sklearn also shows negative KL values when using PCA initialization (but not with random initialization)!
```
Z = SklearnTSNE(verbose=2, init='pca').fit_transform(X)
```
results in
```
[t-SNE] Computing 91 nearest neighbors...
```
...