I am running OneVsRestClassifier with an SVC as the estimator, tuned with GridSearchCV. This is the relevant part of my Pipeline and GridSearchCV parameters:
from sklearn.pipeline import Pipeline
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

pipeline = Pipeline([
    ('clf', OneVsRestClassifier(SVC(verbose=True), n_jobs=1)),
])

parameters = {
    "clf__estimator__C": [0.1, 1],
    "clf__estimator__kernel": ['poly', 'rbf'],
    "clf__estimator__degree": [2, 3],
}

grid_search_tune = GridSearchCV(pipeline, parameters, cv=2, n_jobs=8, verbose=10)
grid_search_tune.fit(train_x, train_y)
According to the SVC documentation, the degree parameter is only used by the poly kernel:
http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html

degree : int, optional (default=3)
Degree of the polynomial kernel function ('poly'). Ignored by all other kernels.
But when I look at the output of GridSearchCV, it seems to run a separate fit for every SVC configuration with the rbf kernel and each different value of the degree parameter:

[CV] clf__estimator__kernel=poly, clf__estimator__C=0.1, clf__estimator__degree=2
[CV] clf__estimator__kernel=poly, clf__estimator__C=0.1, clf__estimator__degree=2
[CV] clf__estimator__kernel=rbf, clf__estimator__C=0.1, clf__estimator__degree=2
[CV] clf__estimator__kernel=rbf, clf__estimator__C=0.1, clf__estimator__degree=2
[CV] clf__estimator__kernel=poly, clf__estimator__C=0.1, clf__estimator__degree=3
[CV] clf__estimator__kernel=poly, clf__estimator__C=0.1, clf__estimator__degree=3
[CV] clf__estimator__kernel=rbf, clf__estimator__C=0.1, clf__estimator__degree=3
[CV] clf__estimator__kernel=rbf, clf__estimator__C=0.1, clf__estimator__degree=3
Shouldn't all values of degree be ignored when the kernel is set to rbf?
Answer 0 (score: 1)
The output shown here is only the different combinations of parameters passed by GridSearchCV to the internal estimator, i.e. the SVC. Whether or not they are actually used depends on the SVC. In this case the SVC doesn't throw any error, but it also doesn't use the degree. You should print the scores of all the combinations you are doubtful about; they should be equal. That will tell you that the degree parameter is unused.
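As a quick check, here is a minimal sketch (assuming the fitted grid_search_tune object from the question, with pandas available) that lists the mean test scores of the rbf runs; identical scores for degree=2 and degree=3 at the same C confirm the parameter is ignored:

import pandas as pd

# Collect every parameter combination tried by the grid search
results = pd.DataFrame(grid_search_tune.cv_results_)

# Keep only the rbf runs and compare their scores across degree values;
# if degree is ignored, rows that differ only in degree have equal scores.
rbf_rows = results[results["param_clf__estimator__kernel"] == "rbf"]
print(rbf_rows[["param_clf__estimator__C",
                "param_clf__estimator__degree",
                "mean_test_score"]])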
Note: Make sure the cross-validation splits are fixed (for example by passing a cv splitter with a set random_state) so that you can reproduce the runs you are comparing.
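One way to do that, as a sketch reusing the pipeline and parameters from the question (KFold here is an assumption; any splitter with a fixed random_state works):

from sklearn.model_selection import KFold

# A splitter with a fixed random_state makes repeated grid searches comparable.
cv = KFold(n_splits=2, shuffle=True, random_state=42)
grid_search_tune = GridSearchCV(pipeline, parameters, cv=cv, n_jobs=8, verbose=10)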
Explanation: The job of GridSearchCV is simply to pass each parameter combination and the training data to the estimator for fitting, score the estimator on the held-out data, and report the parameter combination that produced the best score.
When an incompatible combination of parameters is passed to an estimator, it depends on the implementation whether the parameters are ignored or an error is raised.
For example, in LogisticRegression there are two relevant parameters:
penalty : str, 'l1' or 'l2', default: 'l2'
Used to specify the norm used in the penalization.

solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag'}, default: 'liblinear'
Algorithm to use in the optimization problem. 'newton-cg', 'lbfgs' and 'sag' only handle L2 penalty.
As you can see, using the l1 penalty with the newton-cg solver is an incompatible combination. The estimator could choose to ignore the penalty parameter altogether or to throw an error; in this case it throws an error.
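To illustrate, a small sketch (the make_classification dataset is just a placeholder; the exact error message depends on the scikit-learn version):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, random_state=0)

# 'newton-cg' only supports the l2 penalty, so this combination
# raises a ValueError rather than being silently ignored.
try:
    LogisticRegression(penalty='l1', solver='newton-cg').fit(X, y)
except ValueError as e:
    print("Incompatible combination:", e)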