TopicTuner

Why is tmt.reduce() faster than BERTopic's original UMAP method?

Open evelynxyx opened this issue 2 years ago • 0 comments

For the same docs, dimensionality reduction in BERTopic took 1.5 hours, but tmt.reduce() took only a little over 10 minutes.

The following is the output of tmt.reduce():

UMAP(angular_rp_forest=True, metric='cosine', min_dist=0.0, n_components=5, n_neighbors=5, random_state=473921, verbose=2)
Wed Jan 10 00:40:47 2024 Construct fuzzy simplicial set
Wed Jan 10 00:40:48 2024 Finding Nearest Neighbors
Wed Jan 10 00:40:48 2024 Building RP forest with 37 trees
Wed Jan 10 00:41:01 2024 NN descent for 19 iterations
    1 / 19
    2 / 19
    3 / 19
    4 / 19
    Stopping threshold met -- exiting after 4 iterations
Wed Jan 10 00:41:30 2024 Finished Nearest Neighbor Search
Wed Jan 10 00:41:34 2024 Construct embedding
Epochs completed: 0% 0/200
    completed 0 / 200 epochs
    completed 20 / 200 epochs
    completed 40 / 200 epochs
    completed 60 / 200 epochs
    completed 80 / 200 epochs
    completed 100 / 200 epochs
    completed 120 / 200 epochs
    completed 140 / 200 epochs
    completed 160 / 200 epochs
    completed 180 / 200 epochs
Wed Jan 10 00:54:18 2024 Finished embedding
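
For reference, the timestamps in the log above can be turned into per-stage durations. This is only a minimal sketch using the Python standard library: the stage labels and timestamps are copied by hand from the log, the helper name `elapsed_seconds` is made up for illustration, and it says nothing about how BERTopic spends its 1.5 hours.

```python
from datetime import datetime

# Stage label -> timestamp, copied by hand from the verbose UMAP log above.
# The first timestamp is read as Jan 10 (Jan 10, 2024 is a Wednesday, as logged).
LOG = {
    "Construct fuzzy simplicial set": "Wed Jan 10 00:40:47 2024",
    "Finished Nearest Neighbor Search": "Wed Jan 10 00:41:30 2024",
    "Construct embedding": "Wed Jan 10 00:41:34 2024",
    "Finished embedding": "Wed Jan 10 00:54:18 2024",
}

# Format of the timestamps UMAP prints (same layout as time.ctime()).
FMT = "%a %b %d %H:%M:%S %Y"


def elapsed_seconds(start: str, end: str) -> int:
    """Whole seconds between two log timestamps (hypothetical helper)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


nn_search = elapsed_seconds(LOG["Construct fuzzy simplicial set"],
                            LOG["Finished Nearest Neighbor Search"])
embedding = elapsed_seconds(LOG["Construct embedding"],
                            LOG["Finished embedding"])
total = elapsed_seconds(LOG["Construct fuzzy simplicial set"],
                        LOG["Finished embedding"])

for name, secs in [("nearest-neighbor search", nn_search),
                   ("embedding optimization", embedding),
                   ("total", total)]:
    print(f"{name}: {secs // 60} min {secs % 60} s")
```

By these timestamps, almost all of the run is the 200-epoch embedding step; the nearest-neighbor search finishes in under a minute.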

evelynxyx, Jan 15 '24 17:01