I am trying to use GridSearchCV to tune the parameters of the svm.SVC classifier (both from sklearn).
from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix
import numpy as np
X_train = np.array([[1,2],[3,4],[5,6],[2,3],[9,4],[4,5],[2,7],[1,0],[4,7],[2,9]])
Y_train = np.array([0,1,0,1,0,0,1,1,0,1])
X_test = np.array([[2,4],[5,3],[7,1],[2,4],[6,4],[2,7],[9,2],[7,5],[1,6],[0,3]])
Y_test = np.array([1,0,0,0,1,0,1,1,0,0])
parameters = {'kernel':['rbf'],'C':np.linspace(10,100,10)}
clf1 = GridSearchCV(SVC(), parameters, verbose = 10)
clf1.fit(X_train, Y_train)
cm = confusion_matrix(Y_test, clf1.predict(X_test))
bp = clf1.best_params_
The output shows that GridSearchCV finishes, but then it throws an error:
Traceback (most recent call last):
File "<ipython console>", line 1, in <module>
File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 479, in runfile
execfile(filename, namespace)
File "I:\setup\Desktop\Stats\FinalProject.py", line 112, in <module>
clf1 = GridSearchCV(SVC(), parameters, verbose = 10)
TypeError: 'dict' object is not callable
Answer 0 (score: 0)
When I run the code you posted:
from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix
import numpy as np
X_train = np.array([[1,2],[3,4],[5,6]])
Y_train = np.array([0,1,0])
X_test = np.array([[2,4],[5,3],[7,1]])
Y_test = np.array([1,0,0])
parameters = {'kernel':['rbf'],'C':np.linspace(10,100,10)}
clf1 = GridSearchCV(SVC(), parameters, verbose = 10)
clf1.fit(X_train, Y_train)
cm = confusion_matrix(Y_test, clf1.predict(X_test))
bp = clf1.best_params_
I get this error:
File "C:\Anaconda\lib\site-packages\sklearn\svm\base.py", line 447, in _validate_targets
% len(cls))
ValueError: The number of classes has to be greater than one; got 1
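This is because the training data consists of only 3 samples and GridSearchCV splits it into 3 folds (by the way, you can control this with the parameter called cv), so each fold holds a single sample. For example: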
fold1 = [1,2] , label1 = 0
fold2 = [3,4] , label2 = 1
fold3 = [5,6] , label3 = 0
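Now, in one of the iterations, the first and third folds are used for training and the second fold for validation. Note that these training folds contain only one label (label 0), which is why the error is raised.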
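To make the fold argument concrete, here is a minimal sketch in plain Python that checks which splits leave the training part with a single class. The helper name single_class_splits and the round-robin fold assignment (sample i goes to fold i % n_folds) are assumptions made for illustration; the library's own fold assignment may differ.

```python
# Training samples and labels from the 3-sample example above.
X_train = [[1, 2], [3, 4], [5, 6]]
Y_train = [0, 1, 0]


def single_class_splits(labels, n_folds):
    """Return the 0-based validation folds whose training part has one class.

    Hypothetical helper: assumes sample i belongs to fold i % n_folds,
    which is a simplification of what the library does.
    """
    bad = []
    for fold in range(n_folds):
        # Labels of every sample that is NOT in the current validation fold.
        train_labels = [y for idx, y in enumerate(labels) if idx % n_folds != fold]
        if len(set(train_labels)) < 2:
            bad.append(fold)
    return bad


bad_folds = single_class_splits(Y_train, n_folds=3)
print(bad_folds)
```

Under these assumptions the only failing split is the one that validates on the second sample (index 1), since the remaining two samples both carry label 0. With more samples of each class per fold, every training part would contain both labels.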
If I create the data this way:
import sklearn.cross_validation
from sklearn import datasets
X, Y = datasets.make_classification(n_samples=1000, n_features=4,
                                    n_informative=2, n_redundant=2, n_classes=2)
X_train, X_test, Y_train, Y_test = sklearn.cross_validation.train_test_split(X, Y,
                                                                            test_size=0.2)
it runs fine. I think you have some other problem as well, but as far as the code you posted goes, this is its error.
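For readers on a recent scikit-learn, the same experiment might look like the sketch below. It assumes version 0.18 or later, where GridSearchCV and train_test_split live in sklearn.model_selection (the sklearn.grid_search and sklearn.cross_validation modules were removed in 0.20). The random_state values and the explicit cv=3 are additions for reproducibility and illustration, not part of the code above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# random_state is an assumption added so the run is repeatable.
X, Y = make_classification(n_samples=1000, n_features=4, n_informative=2,
                           n_redundant=2, n_classes=2, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0)

# Same grid as in the question: RBF kernel, C from 10 to 100 in 10 steps.
parameters = {'kernel': ['rbf'], 'C': np.linspace(10, 100, 10)}

# cv sets the number of folds; 3 matches the split discussed above.
clf1 = GridSearchCV(SVC(), parameters, cv=3)
clf1.fit(X_train, Y_train)

cm = confusion_matrix(Y_test, clf1.predict(X_test))
bp = clf1.best_params_
print(cm)
print(bp)
```

With 800 training samples, each fold should contain plenty of examples of both classes, so the single-class error does not come up.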