sklearn - KMeans on arrays: ValueError: setting an array element with a sequence

Time: 2019-02-01 14:32:03

Tags: scikit-learn k-means

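I am trying to run KMeans clustering on multi-dimensional features, where each DataFrame cell holds a list of values. I get ValueError: setting an array element with a sequence.

Here is an example of what I have tried: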

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# 30 samples with 4 integer features
test = pd.DataFrame(np.random.randint(low=0, high=10, size=(30, 4)), columns=['a', 'b', 'c', 'd'])
# Pack pairs of columns into single columns whose cells are Python lists
test["combined1"] = test.loc(axis=1)["a","b"].values.tolist()
test["combined2"] = test.loc(axis=1)["c","d"].values.tolist()
test.drop(['a', 'b', 'c', 'd'], axis=1, inplace=True)
test.head()

kmeans = KMeans(n_clusters=3, random_state=0)
kmeans.fit(test)  # raises the ValueError shown below

The KMeans fit fails with:

/usr/local/lib/python3.5/dist-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    490 
    491     """
--> 492     return array(a, dtype, copy=False, order=order)
    493 
    494 

ValueError: setting an array element with a sequence.

1 Answer:

Answer 0: (score: 0)

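You are passing sequences into KMeans: each cell of your DataFrame is a list (for example [8, 1]) rather than a single number, so NumPy cannot convert the frame into a numeric 2-D array. That is why it fails. See the documentation here: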

https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html#sklearn.cluster.KMeans.fit

The fit() method accepts:

X: array-like or sparse matrix, shape = (n_samples, n_features)
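
One possible way to satisfy that shape requirement is to expand the list-valued columns back into a plain numeric matrix before fitting. The sketch below is a minimal illustration, not the only fix: it rebuilds a frame like the one in the question, and the names `raw`, `X`, and `labels`, the seeded `RandomState`, and the explicit `n_init=10` are assumptions added for the example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Assumed stand-in for the question's data: 30 samples, 4 integer features.
rng = np.random.RandomState(0)
raw = pd.DataFrame(rng.randint(low=0, high=10, size=(30, 4)),
                   columns=["a", "b", "c", "d"])

# Same structure as in the question: every cell is a Python list,
# so both columns end up with dtype object.
test = pd.DataFrame({
    "combined1": raw[["a", "b"]].values.tolist(),
    "combined2": raw[["c", "d"]].values.tolist(),
})

# Turn each list column into an (n_samples, 2) array, then join the
# pieces side by side into one (n_samples, n_features) matrix.
X = np.hstack([np.array(test[col].tolist()) for col in test.columns])

# n_init is set explicitly only to keep behaviour stable across versions.
kmeans = KMeans(n_clusters=3, random_state=0, n_init=10)
labels = kmeans.fit_predict(X)
```

Here `X` has shape (30, 4), one row per sample and one column per scalar feature, which is the layout fit() expects. If the original columns a, b, c, d are still available, passing them directly to fit() avoids the round trip through lists altogether.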