我有两个变量X和Y。
X的结构(即np.array):
Usage
$ lerna bootstrap
Bootstrap the packages in the current Lerna repo. Installs all of their dependencies and links any cross-dependencies.
When run, this command will:
npm install all external dependencies of each package.
Symlink together all Lerna packages that are dependencies of each other.
npm run prepublish in all bootstrapped packages (unless --ignore-prepublish is passed).
npm run prepare in all bootstrapped packages.
Y的结构:
prepublish
Y中的数组,例如[1252,26777,26831]是三个单独的特征。
我正在使用scikit学习模块中的Knn分类器
[[26777 24918 26821 ... -1 -1 -1]
[26777 26831 26832 ... -1 -1 -1]
[26777 24918 26821 ... -1 -1 -1]
...
[26811 26832 26813 ... -1 -1 -1]
[26830 26831 26832 ... -1 -1 -1]
[26830 26831 26832 ... -1 -1 -1]]
但是我收到一条错误消息:
ValueError:不支持multiclass-multioutput
我猜不支持'Y'的结构,为了使程序执行我需要进行哪些更改?
输入:
[[1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [25197, 26777, 26781], [25197, 26777, 26781], [25197, 26777, 26781], [26764, 25803, 26781], [26764, 25803, 26781], [25197, 26777, 26781], [25197, 26777, 26781], [1252, 26777, 16172], [1252, 26777, 16172]]
预期输出:
classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(X,Y)
predictions = classifier.predict(X)
print(accuracy_score(Y,predictions))
答案 0 :(得分:0)
如错误所述,KNN
不支持多输出回归/分类。
对于您的问题,您需要MultiOutputClassifier()
。
from sklearn.multioutput import MultiOutputClassifier
knn = KNeighborsClassifier(n_neighbors=3)
classifier = MultiOutputClassifier(knn, n_jobs=-1)
classifier.fit(X,Y)
工作示例:
>>> from sklearn.feature_extraction.text import TfidfVectorizer
>>> corpus = [
... 'This is the first document.',
... 'This document is the second document.',
... 'And this is the third one.',
... 'Is this the first document?',
... ]
>>> vectorizer = TfidfVectorizer()
>>> X = vectorizer.fit_transform(corpus)
>>> Y = [[124323,1234132,1234],[124323,4132,14],[1,4132,1234],[1,4132,14]]
>>> from sklearn.multioutput import MultiOutputClassifier
>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn = KNeighborsClassifier(n_neighbors=3)
>>> classifier = MultiOutputClassifier(knn, n_jobs=-1)
>>> classifier.fit(X,Y)
>>> predictions = classifier.predict(X)
array([[124323, 4132, 14],
[124323, 4132, 14],
[ 1, 4132, 1234],
[124323, 4132, 14]])
>>> classifier.score(X,np.array(Y))
0.5
>>> test_data = ['I want to test this']
>>> classifier.predict(vectorizer.transform(test_data))
array([[124323, 4132, 14]])