I am trying to read a CSV file and apply the k-means algorithm to identify groups of elements.
My code is:
import csv
import numpy as np
import scipy as sp
from sklearn import cluster as sk
print(sk.k_means(np.genfromtxt('keywords.csv', delimiter=' ')[:,:0],3))
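I use genfromtxt because the file has some missing values, and it lets me work around them.
For now I just want to see everything the k_means function returns, but I get: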
/anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
warnings.warn("Mean of empty slice.", RuntimeWarning)
/anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:70: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
File "ejercicio2.py", line 6, in <module>
print(sk.k_means(np.genfromtxt('keywords.csv', delimiter=' ')[:,:0],3))
File "/anaconda/lib/python3.6/site-packages/sklearn/cluster/k_means_.py", line 345, in k_means
x_squared_norms=x_squared_norms, random_state=random_state)
File "/anaconda/lib/python3.6/site-packages/sklearn/cluster/k_means_.py", line 388, in _kmeans_single_elkan
X = check_array(X, order="C")
File "/anaconda/lib/python3.6/site-packages/sklearn/utils/validation.py", line 424, in check_array
context))
ValueError: Found array with 0 feature(s) (shape=(3312, 0)) while a minimum of 1 is required.
Answer 0 (score: 1)
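By writing [:, :0] you are passing all rows but no columns, hence the error. You probably want to send all rows and all columns, in which case just remove that slice from the line. In general, the syntax is: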
data[x:y, a:b]
This means rows from x up to y (y excluded) and columns from a up to b (b excluded).
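To see the effect of the slice on the array shape, here is a minimal sketch. It uses a small made-up array as a stand-in for the contents of keywords.csv (the real file is not shown in the question, so the values and the 4x3 size are assumptions), and the variable names are just for illustration.

```python
import numpy as np

# Hypothetical stand-in for the loaded CSV data: 4 rows, 3 columns.
data = np.arange(12, dtype=float).reshape(4, 3)

no_columns = data[:, :0]    # all rows, columns 0 up to (not including) 0: empty
first_column = data[:, :1]  # all rows, column 0 only (still 2-D)
everything = data[:, :]     # all rows and all columns, same shape as data
sub_block = data[1:3, 0:2]  # rows 1 and 2, columns 0 and 1

print(no_columns.shape)    # (4, 0)
print(first_column.shape)  # (4, 1)
print(everything.shape)    # (4, 3)
print(sub_block.shape)     # (2, 2)
```

The (4, 0) shape is the same situation as the shape=(3312, 0) in the error message: the rows are all there, but there are zero features. Any of the other slices gives an array with at least one column, which is what the ValueError says is required.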