Unable to use neighbors.KNeighborsClassifier with multilabel sparse data

Asked: 2014-03-01 13:05:21

Tags: python scikit-learn ml sparse-matrix knn

My data file looks like this:

1,2,3 4:5 6:7................
11,12,13,14 15:16 17:18 19:20......
.
.
.

I loaded this file with:

from sklearn.datasets import load_svmlight_file
X_train, Y_train = load_svmlight_file(filename, multilabel=True)

After loading the file (for this example):

np.shape(X_train) = 2,3
np.shape(Y_train) = 2,    (list of tuples)
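
The loading step can be reproduced without a file on disk. The following is a minimal sketch, assuming a reasonably recent scikit-learn; the two data lines are copied from the sample above with the elided parts dropped, and the in-memory `BytesIO` buffer is a stand-in for the real file, not part of the original question. It only shows that `multilabel=True` returns the labels as a plain Python list of variable-length tuples rather than an array.

```python
from io import BytesIO

from sklearn.datasets import load_svmlight_file

# Two sample lines in "label,label,... index:value ..." layout (assumed data).
data = b"1,2,3 4:5 6:7\n11,12,13,14 15:16 17:18 19:20\n"

# A binary file-like object is accepted in place of a filename.
X_train, Y_train = load_svmlight_file(BytesIO(data), multilabel=True)

print(type(X_train))   # a scipy sparse matrix with one row per line
print(X_train.shape[0])
print(type(Y_train))   # a list, one tuple of labels per line
print(Y_train)
```

Because the tuples have different lengths, `Y_train` cannot be passed to most estimators directly and needs a conversion step first.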

I then converted Y_train to a sparse matrix (csr_matrix), which for this example has shape (2, 4), using the answer given in "Python: Converting list of tuples(variable size) into array like structure with fixed shape".

But now, when I use:

from sklearn import neighbors

knn = neighbors.KNeighborsClassifier(n_neighbors=3, weights='distance')
knn.fit(X_train, Y_train)
y = knn.predict(X_test)    # X_test has the same format as X_train

I get the following error:

Traceback (most recent call last):
  File "code5.py", line 40, in <module>
    main(sys.argv[1:])
  File "code5.py", line 36, in main
    y = knn.predict(X_test)
  File "/usr/lib/pymodules/python2.7/sklearn/neighbors/classification.py", line 123, in predict
    pred_labels = self._y[neigh_ind]
IndexError: 0-d arrays can't be indexed.
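
One plausible reading of this traceback, offered as an assumption rather than a confirmed diagnosis: when a scipy sparse matrix is wrapped with `np.asarray`, the result is a 0-d object array, and indexing such an array fails in the way shown. The sketch below uses made-up toy data and `MultiLabelBinarizer` (which was not used in the question and needs a newer scikit-learn than the one in the traceback) to turn the label tuples into a dense 0/1 indicator array before fitting. The features stay sparse; only the targets are made dense.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Toy data (hypothetical): 4 samples, 3 features, variable-size label tuples.
X_train = csr_matrix(np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]))
Y_train = [(1, 2), (1, 2), (3,), (3, 4)]

# Wrapping a sparse matrix with np.asarray gives a 0-d object array,
# which cannot be indexed row by row.
wrapped = np.asarray(csr_matrix(np.eye(2)))
print(wrapped.ndim)

# Encode the label tuples as a dense indicator array, one column per label.
mlb = MultiLabelBinarizer()
Y_dense = mlb.fit_transform(Y_train)

knn = KNeighborsClassifier(n_neighbors=3, weights="distance")
knn.fit(X_train, Y_dense)

X_test = csr_matrix(np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))
pred = knn.predict(X_test)                       # shape (n_test, n_labels)
predicted_labels = mlb.inverse_transform(pred)   # back to label tuples
print(predicted_labels)
```

Each column of the indicator array is treated as a separate binary output, so the classifier votes on every label independently among the 3 nearest neighbours.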

0 answers:

No answers yet