hdbscan
A high performance implementation of HDBSCAN clustering.
Hi, I got the error below while running the hdbscan tests. Are there any significant issues? I would appreciate your help. Thank you. PS C:\Users\K19067372\Documents> nosetests -s hdbscan ............................................EE..........E ====================================================================== ERROR: hdbscan.tests.test_hdbscan.test_hdbscan_is_sklearn_estimator...
I am getting the following error while trying to install hdbscan using pipenv. I am using Python 3.8 and pip 21. Building wheels for collected packages: hdbscan Building wheel for...
I have clustered my data using hdbscan, and everything is working. I now want to evaluate/validate the clusters with hyperparameter tuning, using the following code: The matrix passed is a...
Hi there, I was wondering if HDBSCAN is deterministic or not. If its behavior is not deterministic, it would be relevant to add a random seed to initialize and control...
Hi, I am using HDBSCAN to cluster network traffic data. I see that there is a way to retrieve the exemplar points representing the clusters. Would it be possible to instead return...
Thank you for your help! I want to apply HDBSCAN to the problem of entity resolution. In this setting, the number of clusters scales linearly with the number of samples. For example,...
Cluster tree nodes (not nodes from the raw data) should be excluded when extracting exemplars while initializing or modifying PredictionData in https://github.com/scikit-learn-contrib/hdbscan/blob/94744a5715a639ecb084e803f96ddf6c909c3e07/hdbscan/flat.py#L789-L802 https://github.com/scikit-learn-contrib/hdbscan/blob/94744a5715a639ecb084e803f96ddf6c909c3e07/hdbscan/prediction.py#L134-L143 When cluster tree nodes are part of...
Given an HDBSCAN clustering, we'd like to merge some of the clusters to produce parent clusters. The ultimate goal is to have a two-level clustering. A promising approach would be to...
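To make the two-level goal concrete, here is a minimal sketch in plain Python. It assumes a flat label array of the kind HDBSCAN produces (with -1 marking noise) and a hand-written, purely hypothetical `parent_of` mapping from child cluster IDs to parent cluster IDs. How to choose that mapping is the open question in this issue, and neither `to_parent_labels` nor `parent_of` is part of the hdbscan API.

```python
def to_parent_labels(labels, parent_of):
    """Map flat cluster labels to parent-cluster labels.

    Noise points (label -1) stay noise. Every non-noise label must
    appear in parent_of, otherwise a KeyError is raised.
    """
    return [-1 if label == -1 else parent_of[label] for label in labels]


# Hypothetical flat labels for seven points (-1 is noise).
labels = [0, 0, 1, 2, -1, 3, 1]

# Hypothetical merge decision: clusters 0 and 1 share parent 0,
# clusters 2 and 3 share parent 1.
parent_of = {0: 0, 1: 0, 2: 1, 3: 1}

parent_labels = to_parent_labels(labels, parent_of)

# Two-level labelling: (parent cluster, child cluster) per point.
two_level = list(zip(parent_labels, labels))
```

Keeping the child label next to the parent label means the finer clustering is never lost when clusters are merged.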
Hi, the code returns the most stable clusters with
```
import hdbscan

hdb = hdbscan.HDBSCAN(gen_min_span_tree=True)
hdb.fit(X)
hdb.labels_  # -1 marks noise points
```
The cluster IDs range over [-1:n], with n the number of clusters. But if we want...
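As a side note on the label convention mentioned here, the following sketch summarises a label array in plain Python. The `labels` list is made up for illustration and `summarize_labels` is a hypothetical helper, not an hdbscan function; the only assumption carried over from the library is that -1 denotes noise.

```python
from collections import Counter


def summarize_labels(labels):
    """Count clusters, noise points, and per-cluster sizes."""
    sizes = Counter(label for label in labels if label != -1)
    return {
        "n_clusters": len(sizes),
        "n_noise": sum(1 for label in labels if label == -1),
        "sizes": dict(sorted(sizes.items())),
    }


# Made-up labels standing in for a fitted model's labels_ attribute.
labels = [0, 1, 1, -1, 2, 0, -1, 1]
summary = summarize_labels(labels)
```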
[china_df.csv](https://github.com/scikit-learn-contrib/hdbscan/files/8907906/china_df.csv) I am getting erroneous clustering results. As you can see from the picture above, cluster no. 223 includes points from the other side of the country....