Question

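I applied an SVM (scikit-learn) to a dataset and want to find the values of C and gamma that give the best accuracy on the test set.

I first fixed C to some integer and iterated over many values of gamma until I found the gamma that gave the best test-set accuracy for that C. Then I fixed that gamma, iterated over values of C to find the C that gave the highest accuracy, and so on...

But these steps can never be guaranteed to find the combination of gamma and C that produces the best test-set accuracy.

Can anyone help me find a way to get this combination (gamma, C) in scikit-learn?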

Answer 1

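You are looking for hyperparameter tuning. In parameter tuning, you pass a dictionary that lists the candidate values for each parameter of your classifier, and the search method you choose (e.g. GridSearchCV, RandomizedSearchCV) returns the best parameters it found. You can read more about it here.

For example: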

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Create a dictionary of possible parameters
# (gamma is ignored by the linear kernel)
params_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100],
               'gamma': [0.0001, 0.001, 0.01, 0.1],
               'kernel': ['linear', 'rbf']}

# Create the GridSearchCV object
grid_clf = GridSearchCV(SVC(class_weight='balanced'), params_grid)

# Fit on the training data; every parameter combination is cross-validated
grid_clf.fit(X_train, y_train)

# Print the best estimator with its parameters
print(grid_clf.best_estimator_)
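
The snippet above assumes X_train and y_train already exist. As a rough end-to-end sketch, the following runs the same kind of joint search over (C, gamma) on synthetic data. The dataset from make_classification, the grid values, and cv=5 are all assumptions made for illustration; substitute your own data and ranges. Note that the search picks parameters by cross-validation on the training set only, and the test set is used just once at the end to report accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; substitute your own X and y)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Illustrative grid; every (C, gamma) pair is evaluated jointly
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}

# Each pair is scored with 5-fold cross-validation on the training set
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print(search.best_params_)            # best (C, gamma) pair found on the grid
print(search.best_score_)             # its mean cross-validated accuracy
print(search.score(X_test, y_test))   # accuracy on the held-out test set
```

Because all pairs are tried together, this avoids the problem with tuning C and gamma one at a time, where the best gamma for one C may not be the best gamma for another.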

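You can read more about GridSearchCV here and RandomizedSearchCV here. One caveat: SVMs need a lot of CPU power, so be careful with the number of parameter values you pass. The search may take some time, depending on your data and on how many parameter combinations it has to try.

This link also contains an example.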
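
If the full grid is too expensive, RandomizedSearchCV caps the cost by sampling a fixed number of combinations. Here is a minimal sketch, assuming synthetic data and log-uniform ranges for C and gamma chosen only for illustration (scipy.stats.loguniform requires SciPy 1.4 or later):

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; substitute your own X and y)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Sample C and gamma on a log scale instead of listing every value
param_distributions = {
    "C": loguniform(1e-3, 1e2),
    "gamma": loguniform(1e-4, 1e-1),
}

# n_iter fixes the number of sampled combinations, and so the total cost
search = RandomizedSearchCV(
    SVC(kernel="rbf"),
    param_distributions,
    n_iter=20,
    cv=5,
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)  # best sampled (C, gamma) pair
print(search.best_score_)   # its mean cross-validated accuracy
```

With n_iter=20 and cv=5 this fits 100 models plus one final refit, regardless of how wide the ranges are.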

Finding the values of C and gamma to optimize SVM
