
Bug with Kshape clustering

Open pengfei123xiao opened this issue 6 years ago • 2 comments

Hi, I am using the 'kshape' method in your package. I have run into a bug: when 'n_clusters' is set to a large value, the 'fit_predict' method seems to return wrong results. For example, in the graph below I set 'n_clusters' to 7, 8, 9 and 10 in turn, but the number of unique labels returned differs from the number of clusters I requested. My code and results are attached. Please have a look. Many thanks. [image: bug]
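
The check described above can be sketched as follows. This is only an illustration, not the code from the attached screenshot: the helper name `summarize_labels` and the example labels are hypothetical, and in practice the labels would presumably come from something like `KShape(n_clusters=k).fit_predict(X)`.

```python
from collections import Counter


def summarize_labels(labels, n_clusters):
    """Compare the clusters that actually received members with n_clusters.

    `labels` is any iterable of integer cluster ids, for example the array
    returned by a clustering estimator's fit_predict method.
    """
    counts = Counter(int(label) for label in labels)
    # Cluster ids in [0, n_clusters) that no sample was assigned to.
    empty = [k for k in range(n_clusters) if counts[k] == 0]
    return {
        "requested": n_clusters,
        "found": len(counts),
        "empty_clusters": empty,
    }


# Hypothetical labels for illustration only: 7 clusters requested, 5 used.
example_labels = [0, 0, 1, 3, 3, 3, 4, 6, 6, 1]
summary = summarize_labels(example_labels, n_clusters=7)
print(summary)
```

If "found" is smaller than "requested", some clusters ended up empty, which matches the symptom reported here.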

pengfei123xiao avatar Apr 21 '19 14:04 pengfei123xiao

Hi, when I downgraded tslearn to version 0.1.26, the problem disappeared. I hope this information helps.

pengfei123xiao avatar Apr 23 '19 10:04 pengfei123xiao

This bug (also reported in https://github.com/tslearn-team/tslearn/issues/439) is still present. I have been running KShape on all datasets in the UCR archive (https://www.cs.ucr.edu/~eamonn/time_series_data_2018/) and found that, for the Adiac dataset, I could not get tslearn.clustering.KShape to return a partition with 37 clusters (the number of ground-truth classes), despite numerous attempts with a large number of initialisations. When using the author's "kshape" function from https://github.com/TheDatumOrg/kshape-python/blob/main/kshape/core.py, I was able to obtain a partition with 37 clusters in one initialisation. It seems that in the tslearn implementation, cluster centers coincide, which reduces the effective number of clusters for this particular dataset. Given the original report from 2019, the problem does not appear to be isolated to particular datasets.
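
One rough way to test the coinciding-centers hypothesis is sketched below. It assumes each center has already been flattened to a 1-D sequence of floats (for instance from the fitted model's `cluster_centers_` attribute); the helper name `coinciding_centers`, the tolerance and the example centers are hypothetical. It uses plain Euclidean distance, so it would only catch centers that are numerically identical, not centers that are shifted copies of each other.

```python
import math
from itertools import combinations


def coinciding_centers(centers, tol=1e-8):
    """Return index pairs (i, j) whose centers lie within `tol` of each other.

    `centers` is a sequence of equal-length 1-D sequences of floats.
    Distance is plain Euclidean, which is an assumption of this sketch.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(centers), 2):
        dist = math.sqrt(sum((float(x) - float(y)) ** 2 for x, y in zip(a, b)))
        if dist <= tol:
            pairs.append((i, j))
    return pairs


# Hypothetical centers for illustration only: centers 0 and 2 are identical.
example_centers = [
    [0.0, 1.0, -1.0],
    [1.0, 0.0, -1.0],
    [0.0, 1.0, -1.0],
]
duplicates = coinciding_centers(example_centers)
print(duplicates)
```

Any pair reported here would point to two clusters sharing the same centroid, which would be consistent with fewer distinct labels than requested.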

yerbles avatar Oct 03 '23 06:10 yerbles