Python, sklearn: 'dict' object is not callable when using GridSearchCV with SVC

Time: 2015-06-11 21:21:16

Tags: python scikit-learn

I am trying to use GridSearchCV to optimize the parameters of the classifier svm.SVC (both from sklearn).

from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix
import numpy as np

X_train = np.array([[1,2],[3,4],[5,6],[2,3],[9,4],[4,5],[2,7],[1,0],[4,7],[2,9]])
Y_train = np.array([0,1,0,1,0,0,1,1,0,1])
X_test = np.array([[2,4],[5,3],[7,1],[2,4],[6,4],[2,7],[9,2],[7,5],[1,6],[0,3]])
Y_test = np.array([1,0,0,0,1,0,1,1,0,0])
parameters = {'kernel':['rbf'],'C':np.linspace(10,100,10)}
clf1 = GridSearchCV(SVC(), parameters, verbose = 10)
clf1.fit(X_train, Y_train)
cm = confusion_matrix(Y_test, clf1.predict(X_test))
bp = clf1.best_params_

The output shows that GridSearchCV finishes, but then it throws this error:

Traceback (most recent call last):
File "<ipython console>", line 1, in <module>
File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 479, in runfile
execfile(filename, namespace)
File "I:\setup\Desktop\Stats\FinalProject.py", line 112, in <module>
clf1 = GridSearchCV(SVC(), parameters, verbose = 10)
TypeError: 'dict' object is not callable

1 answer:

Answer 0 (score: 0)

When I run the code you posted:

from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix
import numpy as np

X_train = np.array([[1,2],[3,4],[5,6]])
Y_train = np.array([0,1,0])
X_test = np.array([[2,4],[5,3],[7,1]])
Y_test = np.array([1,0,0])
parameters = {'kernel':['rbf'],'C':np.linspace(10,100,10)}
clf1 = GridSearchCV(SVC(), parameters, verbose = 10)
clf1.fit(X_train, Y_train)
cm = confusion_matrix(Y_test, clf1.predict(X_test))
bp = clf1.best_params_

I get this error:

File "C:\Anaconda\lib\site-packages\sklearn\svm\base.py", line 447, in _validate_targets
    % len(cls))
ValueError: The number of classes has to be greater than one; got 1

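This happens because the training data consists of only 3 samples, and GridSearchCV splits the data into 3 folds (by the way, you can control this with the parameter called cv).

For example: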

fold1 = [1,2]  ,   label1 = 0
fold2 = [3,4]  ,   label2 = 1
fold3 = [5,6]  ,   label3 = 0

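Now, in one of the iterations, the first and third folds are used for training and the second fold for validation. Note that these training folds contain only one label (label 0), which is why the error is raised.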
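The split described above can be sketched in plain Python. This is only an illustration, assuming one sample per fold as in the listing above; the helper one_sample_per_fold is hypothetical and is not how scikit-learn assigns folds internally.

```python
# Toy data from the answer: three samples with labels 0, 1, 0.
X_train = [[1, 2], [3, 4], [5, 6]]
Y_train = [0, 1, 0]


def one_sample_per_fold(n_samples):
    """Yield (train_indices, validation_indices), holding out one sample at a time.

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, [held_out]


single_class_splits = []
for train_idx, val_idx in one_sample_per_fold(len(Y_train)):
    train_labels = sorted(set(Y_train[i] for i in train_idx))
    print("validate on sample", val_idx[0], "-> training labels:", train_labels)
    if len(train_labels) < 2:
        # An SVC cannot be fit when the training labels contain one class only.
        single_class_splits.append(val_idx[0])

print("splits with a single training class:", single_class_splits)
```

Only the split that holds out the second sample (index 1) leaves a single class for training, which matches the iteration described above.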

If I create the data this way:

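import sklearn.cross_validation
from sklearn import datasets

# Generate a larger two-class dataset so every training fold contains both labels
X, Y = datasets.make_classification(n_samples=1000, n_features=4,
                                n_informative=2, n_redundant=2, n_classes=2)
X_train, X_test, Y_train, Y_test = sklearn.cross_validation.train_test_split(X,Y,
                                                                         test_size =0.2)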

it runs fine. I suspect you have some other problem as well, but for the code you posted, this is the error it produces.
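As a guess at what that other problem might be: the message 'dict' object is not callable appears whenever a name that is bound to a dict gets called like a function. One common way to end up there in a long script is to rebind a class name to a dict by accident. The sketch below reproduces the message under that assumption; the class Search is a hypothetical stand-in, and nothing in the posted traceback confirms that this is what happened in your file.

```python
# Hypothetical stand-in for a class such as GridSearchCV.
class Search(object):
    def __init__(self, params):
        self.params = params


parameters = {'kernel': ['rbf'], 'C': [10, 100]}
search = Search(parameters)  # works: Search is still a class here

# Assumption for illustration: somewhere in the script the name is rebound to a dict.
Search = {'kernel': ['rbf']}

try:
    Search(parameters)  # this now tries to call a dict
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

Printing type(GridSearchCV) just before the failing line of the real script would show whether the name still refers to the class.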