Euclidean distance between elements in two different matrices?

Asked: 2016-08-05 12:54:27

Tags: python numpy

I am trying to determine the Euclidean distance of my documents from their centroids. The two arrays in question (the points X and centers) satisfy the dimension requirements for XA and XB in scipy.spatial.distance.cdist, but I don't know why I get the following ValueError.

My code:

import pandas as pd, numpy as np
from scipy.spatial.distance import cdist
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

corpus = pd.Series(["bye bye brutal good bye apple banana orange", "bye bye hello apple banana", "corn wheat apple banana goodbye cookie brutal", "fruit cake banana apple bye sweet sweet"])
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)  # X is a scipy sparse matrix
model = KMeans(n_clusters=2)
model.fit(X)
centers = model.cluster_centers_

cdist(X, centers)  # this call raises the ValueError

This is the error I get:

ValueError: setting an array element with a sequence.

From the documentation for scipy.spatial.distance.cdist:

Parameters:
XA: ndarray
    An Ma by n array of Ma original observations in an n-dimensional space
XB: ndarray
    An Mb by n array of Mb original observations in an n-dimensional space
...

My X and centers arrays definitely satisfy these dimension conditions, right? What am I missing?

1 answer:

Answer 0 (score: 2)

You need to make one small change:

cdist(X.toarray(), centers)

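X is an object of type scipy.sparse.csr.csr_matrix, so the scipy function does not accept it directly as valid input even though its shape is correct. The toarray() method converts it into a regular dense numpy array, which cdist can handle.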
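As a minimal sketch of the fix, the snippet below uses small hand-made stand-ins instead of the question's data: the sparse matrix X plays the role of the TF-IDF output and centers plays the role of the fitted centroids. Both sets of values are assumptions chosen for illustration so that the distances are easy to check by hand.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse
from scipy.spatial.distance import cdist

# Hypothetical stand-in for the TF-IDF matrix: 3 documents, 2 features.
X = csr_matrix(np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]]))

# Hypothetical stand-in for the fitted centroids: 2 clusters, 2 features.
centers = np.array([[0.0, 0.0], [3.0, 0.0]])

# Densify only when the input is sparse, then compute pairwise distances.
X_dense = X.toarray() if issparse(X) else np.asarray(X)
distances = cdist(X_dense, centers)  # shape: (n_documents, n_clusters)

# Index of the closest centroid for each document.
nearest = distances.argmin(axis=1)

print(distances)
print(nearest)
```

Note that toarray() builds a full dense copy, which can use a lot of memory for a large corpus. In that case it may be worth looking at sklearn.metrics.pairwise.euclidean_distances, which accepts sparse input.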