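I am writing a Python program that computes the Dunn index to evaluate clustering quality. I am following code published on a website. It needs the minimum distance between clusters and the maximum distance within a cluster: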
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances
...
def delta_fast(ck,cl,distances):
    values = distances[np.where(ck)][:,np.where(cl)]
    print(values)

def dunn_fast(points,labels):
    distances = euclidean_distances(points)
    print("distances")
    print(distances)
    print(distances.shape[0])
    print(distances.shape[1])
    ks = np.sort(np.unique(labels))
    print("ks")
    print(ks)
    deltas = np.ones([len(ks),len(ks)]) * 1000000
    big_deltas = np.zeros([len(ks),1])
    l_range = list(range(0,len(ks)))
    for k in l_range:
        for l in (l_range[0:k] + l_range[k+1:]):
            deltas[k,l] = delta_fast((labels == ks[k]),(labels == ks[l]),distances)
The distances are a DataFrame (1406 * 1406), but the code fails with this error:
Traceback (most recent call last):
  File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_dunnIndex.py", line 100, in <module>
    get_group_members_cluster_info(cluster_method,cluster_number)
  File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_dunnIndex.py", line 89, in get_group_members_cluster_info
    dunn_fast(cal_cluster_data_df,cluster_data_label_df)
  File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_dunnIndex.py", line 48, in dunn_fast
    deltas[k,l] = delta_fast((labels == ks[k]),(labels == ks[l]),distances)
  File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_dunnIndex.py", line 12, in delta_fast
    values = distances[np.where(ck)][:,np.where(cl)]
IndexError: too many indices for array
It looks like this statement is the one that fails: values = distances[np.where(ck)][:,np.where(cl)]
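One way this statement can raise that error is sketched below. It is only a guess at the cause: it assumes labels is a one-column DataFrame, which the name cluster_data_label_df in the traceback hints at but does not confirm. The toy distance matrix and the names labels_df, idx and picked are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: a 4 x 4 "distance" matrix and four cluster labels.
distances = np.arange(16, dtype=float).reshape(4, 4)
labels_df = pd.DataFrame({"label": [0, 0, 1, 1]})  # assumed shape (4, 1)

ck = labels_df == 0       # comparing a DataFrame gives a 2-D boolean mask
idx = np.where(ck)        # 2-D mask -> tuple of TWO arrays (rows, columns)
picked = distances[idx]   # rows are paired with columns, so the result is 1-D

try:
    picked[:, np.where(labels_df == 1)]  # two indices on a 1-D array
    error = None
except IndexError as exc:
    error = exc           # same kind of failure as in the traceback

print(len(idx), picked.ndim, type(error).__name__)
```

With a 1-D label array, np.where would return a single index array and the first lookup would select whole rows, so the second lookup with two indices would be valid.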
Can you tell me why this happens and how to fix it?
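A possible fix, written as a minimal sketch rather than the website's implementation: flatten the labels to a 1-D array before building the masks, and select the sub-matrix with np.ix_ instead of chained np.where lookups. The helper big_delta_fast, the variable names min_between and max_within, and the toy data at the bottom are assumptions made for this example. The sketch also assumes the Dunn index is taken as the smallest between-cluster distance divided by the largest within-cluster distance, as described at the top.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances


def delta_fast(ck, cl, distances):
    # np.ix_ builds an open mesh: rows come from mask ck, columns from mask cl.
    return distances[np.ix_(ck, cl)].min()


def big_delta_fast(ci, distances):
    # Hypothetical helper: largest pairwise distance inside one cluster.
    return distances[np.ix_(ci, ci)].max()


def dunn_fast(points, labels):
    # Flatten the labels so that (labels == k) is a 1-D boolean mask,
    # even when a one-column DataFrame or an (n, 1) array is passed in.
    labels = np.ravel(np.asarray(labels))
    distances = euclidean_distances(points)
    ks = np.unique(labels)  # np.unique already returns sorted values

    min_between = np.inf
    max_within = 0.0
    for i, k in enumerate(ks):
        max_within = max(max_within, big_delta_fast(labels == k, distances))
        for l in ks[i + 1:]:
            min_between = min(
                min_between, delta_fast(labels == k, labels == l, distances)
            )
    return min_between / max_within


# Toy data: two tight clusters that are 10 units apart.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
labels = np.array([[0], [0], [1], [1]])  # column shape (4, 1) on purpose
score = dunn_fast(points, labels)
print(score)  # 10.0 for this toy data (separation 10, diameter 1)
```

np.ix_ accepts 1-D boolean masks directly, so no np.where call is needed. If the labels stay 2-D, np.ix_ raises an error itself, which is why the flattening step comes first.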