我尝试了以下代码来获取tfidf_matrix(这是一个术语文档频率矩阵)的肘曲线。但是我收到的错误如图所示。 [![在此输入图片说明] [1]] [1] 可以做些什么来解决这个问题?
from scipy.spatial.distance import cdist, pdist
from sklearn.cluster import KMeans
K = range(1,50)
KM = [KMeans(n_clusters=k).fit(tfidf_matrix) for k in K]
centroids = [k.cluster_centers_ for k in KM]
D_k = [cdist(tfidf_matrix, cent, 'euclidian') for cent in centroids]
cIdx = [np.argmin(D,axis=1) for D in D_k]
dist = [np.min(D,axis=1) for D in D_k]
avgWithinSS = [sum(d)/tfidf_matrix.shape[0] for d in dist]
# Total with-in sum of square
wcss = [sum(d**2) for d in dist]
tss = sum(pdist(tfidf_matrix)**2)/dt_trans.shape[0]
bss = tss-wcss
kIdx = 10-1
tfidf_matrix是我们从文档中获得的术语文档频率矩阵。 This is the error