How do I compute mean_distances, the mean distance from the centroid to each point in a cluster, for k clusters?
Formula:
My code:
def mean_distances(k, X):
    """
    Arguments:
    k -- int, number of clusters
    X -- np.array, matrix of input features
    Returns:
    Array of shape (k, ), containing mean of sum distances
    from centroid to each point in the cluster for k clusters
    """
    ### START CODE HERE ###
    mod = KMeans(X, k)
    clusters, final_centrs = mod.final_centroids()
    dist = []
    for i in range(k):
        d = np.sum(np.linalg.norm((clusters[i] - final_centrs[i, :])**2)).mean()
        dist.append(d)
    return dist
    ### END CODE HERE ###
But it does not work correctly. (P.S. No scikit-learn, only NumPy.)
Answer 0 (score: 0)
You are taking the mean of each element of the outer sum (that is, of each inner sum) rather than the mean of the outer sum:
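import numpy as np
# KMeans is assumed to be the asker's own class exposing final_centroids();
# sklearn.cluster.KMeans has no such method, and the question rules out scikit-learn.
def mean_distances(k, X):
    """
    Arguments:
    k -- int, number of clusters
    X -- np.array, matrix of input features
    Returns:
    Array of shape (k, ), containing mean of sum distances
    from centroid to each point in the cluster for k clusters
    """
    mod = KMeans(X, k)
    clusters, final_centrs = mod.final_centroids()
    dist = []
    for i in range(k):
        # Sum for cluster i; the mean is taken once, after the loop
        d = np.sum(np.linalg.norm((clusters[i] - final_centrs[i, :])**2))
        dist.append(d)
    # dist is a Python list, so use np.mean rather than dist.mean()
    return np.mean(dist)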
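
If what you want is one value per cluster, as the docstring's "Array of shape (k, )" suggests, here is a minimal NumPy-only sketch of one possible reading: the mean Euclidean distance from each centroid to the points assigned to it. The asker's KMeans class is not shown, so the function name `mean_distances_sketch` and the input layout (a list of k arrays of shape (n_i, d) plus a (k, d) centroid array) are assumptions made for illustration. Note that `np.linalg.norm` called without `axis` collapses a 2-D array into a single Frobenius norm, so `axis=1` is used here to get one distance per point.

```python
import numpy as np


def mean_distances_sketch(clusters, centroids):
    """Mean Euclidean distance from each centroid to the points of its cluster.

    clusters  -- list of k arrays, each of shape (n_i, d)  (assumed layout)
    centroids -- array of shape (k, d)
    Returns an array of shape (k,).
    """
    means = []
    for points, centroid in zip(clusters, centroids):
        # One distance per point: take the norm along the feature axis
        per_point = np.linalg.norm(points - centroid, axis=1)
        means.append(per_point.mean())
    return np.array(means)


# Small hand-checkable example (hypothetical data)
clusters = [
    np.array([[0.0, 0.0], [2.0, 0.0]]),
    np.array([[0.0, 3.0], [0.0, 5.0], [0.0, 7.0]]),
]
# Each centroid is the mean of its cluster's points
centroids = np.array([c.mean(axis=0) for c in clusters])

result = mean_distances_sketch(clusters, centroids)
print(result)
```

If the intended formula uses squared distances instead, replacing `per_point` with `per_point ** 2` before taking the mean would be the corresponding change.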