Why is the algorithm faster in high dimensions?

Open Tsepu opened this issue 2 years ago • 3 comments

Why is the algorithm faster in high dimensions?

I tried the algorithm on several cases (1000 points with 2 to 8 dimensions). It returns results faster in low dimensions than in high dimensions. Is there a reason for this?

Thanks

Tsepu avatar Apr 20 '23 07:04 Tsepu

Several things can explain this, depending on your data and the eps parameter you used. As the number of dimensions increases, the distances between data points change. Thus, "being within eps distance" takes on a different meaning, which can, for example, heavily influence the cost of the connected component search during insertion.
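To make the point about eps concrete, here is a rough sketch that counts how many points fall within a fixed eps of each point as the dimension grows. It does not use incdbscan at all. The uniform random data, the Euclidean metric, eps = 0.3, and the helper name `mean_neighbors` are all assumptions chosen for illustration, so real data may behave differently.

```python
import numpy as np


def mean_neighbors(n_points, n_dims, eps, seed=0):
    """Average number of other points within eps (Euclidean) of each point.

    Hypothetical helper for illustration; not part of incdbscan.
    """
    rng = np.random.default_rng(seed)
    # Assumption: points drawn uniformly from the unit hypercube.
    X = rng.random((n_points, n_dims))
    # Squared pairwise distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Exclude each point from its own neighborhood.
    np.fill_diagonal(d2, np.inf)
    return float((d2 <= eps ** 2).sum(axis=1).mean())


# Same number of points and same eps, only the dimension changes.
counts = {d: mean_neighbors(1000, d, eps=0.3) for d in range(2, 9)}
for d, c in counts.items():
    print(f"{d} dims: {c:.1f} neighbors within eps on average")
```

With these assumptions the average neighborhood shrinks quickly as the dimension goes up, so the same eps describes a very different neighborhood structure in 2 dimensions than in 8. How that translates into runtime depends on the data and on how much work each insertion triggers.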

DataOmbudsman avatar Oct 12 '23 20:10 DataOmbudsman

IncDBSCAN is very slow with about 700,000 points in 1024 dimensions. The distance metric is cosine and eps is set to 0.12. Is there any solution? Thanks.

KwaiYii-Center avatar May 22 '24 08:05 KwaiYii-Center