How to append silhouette scores to a list

Time: 2019-11-10 08:53:51

Tags: python k-means silhouette

I want to append the silhouette score to a list inside a loop.

        from sklearn.cluster import KMeans
        from sklearn.metrics import silhouette_score

        ks = range(1, 11) # for 1 to 10 clusters
        #sse = []
        sil = []

        for k in ks:
             # Create a KMeans instance with k clusters: model
             kmeans = KMeans(n_clusters = k)
             # Fit model to samples
             #kmeans.fit(X)
             cluster_labels = kmeans.fit_predict(X)  # X is the dataset, already preprocessed
             silhouette = silhouette_score(X, cluster_labels)


             # Append the inertia to the list of inertias
             #sse.append(kmeans.inertia_)

             #Append silhouette to the list
             sil.append(silhouette)

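However, when I compute the silhouette with silhouette_score, I get the following error at the silhouette_score call (line 20 in the traceback):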

       ValueError                   Traceback (most recent call last)
       <ipython-input-12-2570ccf62502> in <module>()
       18     #kmeans.fit(X)
       19     cluster_labels = kmeans.fit_predict(X)
   --->20     silhouette = silhouette_score(X, cluster_labels)
       21 
       22 

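2 Answers:

Answer 0: (score: 0)

Is this the full code or only part of it? If there is no code before it, then X is clearly being used before it has been defined or assigned.

So move these lines to after the point where you assign X, and everything should work.

Otherwise, please add the full traceback to the question.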
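
A rough way to narrow this down, as a sketch rather than a diagnosis of the asker's actual data: an undefined X would raise a NameError, while the traceback in the question shows a ValueError, which silhouette_score raises when the labels contain only one cluster (as happens for k = 1). The dataset below is a hypothetical stand-in for X, and the helper try_silhouette and the n_init=10 and random_state=0 settings are assumptions made only for this illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Hypothetical stand-in for the asker's preprocessed dataset X.
rng = np.random.RandomState(0)
X = rng.rand(50, 2)


def try_silhouette(X, k):
    """Return the silhouette score for k clusters, or the error text if scoring fails."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    try:
        return silhouette_score(X, labels)
    except ValueError as exc:
        # silhouette_score needs at least 2 distinct labels
        return "ValueError: {}".format(exc)


result_k1 = try_silhouette(X, 1)  # a single cluster: scoring fails
result_k2 = try_silhouette(X, 2)  # two clusters: a score in [-1, 1]
print(result_k1)
print(result_k2)
```

If the k = 1 call is the one that fails on your data too, starting the range at 2 is the likely fix; if you see a NameError instead, X has not been defined yet.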

Answer 1: (score: 0)

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score
  
X, y = make_blobs(n_samples=500,
                  n_features=2,
                  centers=4,
                  cluster_std=1,
                  center_box=(-10.0, 10.0),
                  shuffle=True,
                  random_state=1) 
sil = []
# Start the cluster range at 2: silhouette_score requires at least 2 distinct labels
range_n_clusters = range(2, 10)

for n_clusters in range_n_clusters:
    clusterer = KMeans(n_clusters=n_clusters, random_state=10)
    cluster_labels = clusterer.fit_predict(X)
    silhouette_avg = silhouette_score(X, cluster_labels)
    print("For n_clusters =", n_clusters,
          "The average silhouette_score is :", silhouette_avg)
    sil.append(silhouette_avg)

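This is an example of applying KMeans clustering to randomly generated samples and using the silhouette score to find the best number of clusters. I hope it helps, or at least gives you more information.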
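
Once the scores are in the list, picking the number of clusters can be sketched as below. This assumes that "best" simply means the highest average silhouette score, which is a common heuristic but not the only criterion; the variable best_k and the n_init=10 setting are additions for illustration and are not part of the code above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Same synthetic data as in the answer above.
X, _ = make_blobs(n_samples=500,
                  n_features=2,
                  centers=4,
                  cluster_std=1,
                  center_box=(-10.0, 10.0),
                  shuffle=True,
                  random_state=1)

# Start at 2: silhouette_score is undefined for a single cluster.
range_n_clusters = list(range(2, 10))
sil = []

for n_clusters in range_n_clusters:
    clusterer = KMeans(n_clusters=n_clusters, n_init=10, random_state=10)
    cluster_labels = clusterer.fit_predict(X)
    sil.append(silhouette_score(X, cluster_labels))

# Index of the highest score maps back to the matching cluster count.
best_k = range_n_clusters[int(np.argmax(sil))]
print("n_clusters with the highest average silhouette score:", best_k)
```

Keeping range_n_clusters and sil the same length means the position of each score identifies its cluster count, so the list can also be plotted against range_n_clusters directly.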