How do I use GridSearchCV with scipy?

Asked: 2016-08-09 09:23:08

Tags: pandas grid-search sklearn-pandas

I have been trying to use GridSearchCV to tune my SVM, but it throws an error.

My code is:

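import pandas as pd
from sklearn.svm import SVC
from sklearn.grid_search import GridSearchCV  # sklearn.model_selection in scikit-learn 0.18+

train = pd.read_csv('train_set.csv')
label = pd.read_csv('lebel.csv')

params = {'C': [0.01, 0.1, 1, 10]}
clf = GridSearchCV(SVC(), params, n_jobs=-1)
clf.fit(train, label)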

It throws the error: 'too many indices for array'

But when I do this:

clf = SVC()
clf.fit(train.data, label.data)

the code works fine.

1 Answer:

Answer 0: (score: 1)

I suspect the problem lies in your data structures train.data / label.data. I tested both versions of the code on random data and they both work:

import numpy as np
import sklearn.svm as sksvm
import sklearn.grid_search as skgs  # moved to sklearn.model_selection in scikit-learn 0.18+

params = { 'C' : [ 0.01 , 0.1 , 1 , 10]}
X = np.random.rand(1000, 10)  # (1000 x 10) matrix, 1000 points with 10 features
Y = np.random.randint(0, 2, 1000)  # 1000 array, binary labels

mod = sksvm.SVC()
mod.fit(X, Y)

Output:

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)

import numpy as np
import sklearn.svm as sksvm
import sklearn.grid_search as skgs  # moved to sklearn.model_selection in scikit-learn 0.18+

params = { 'C' : [ 0.01 , 0.1 , 1 , 10]}
X = np.random.rand(1000, 10)  # (1000 x 10) matrix, 1000 points with 10 features
Y = np.random.randint(0, 2, 1000)  # 1000 array, binary labels

mod = skgs.GridSearchCV(sksvm.SVC(), params, n_jobs=-1)
mod.fit(X, Y)

Output:

GridSearchCV(cv=None, error_score='raise',
       estimator=SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False),
       fit_params={}, iid=True, loss_func=None, n_jobs=-1,
       param_grid={'C': [0.01, 0.1, 1, 10]}, pre_dispatch='2*n_jobs',
       refit=True, score_func=None, scoring=None, verbose=0)
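Once the search has been fitted, the selected value of C can be read back from the fitted object. This is a minimal sketch rather than part of the tested code above: it assumes a newer scikit-learn where GridSearchCV lives in sklearn.model_selection, and the smaller sample size, the fixed seed and cv=3 are my own choices to keep it quick.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Random data, as in the answer (smaller and seeded here; both are assumptions)
rng = np.random.RandomState(0)
X = rng.rand(200, 10)        # 200 points with 10 features
Y = rng.randint(0, 2, 200)   # binary labels, 1-D

params = {'C': [0.01, 0.1, 1, 10]}
search = GridSearchCV(SVC(), params, cv=3)
search.fit(X, Y)

# Attributes available after fitting
best_C = search.best_params_['C']   # the value of C that scored best
best_score = search.best_score_     # mean cross-validated accuracy for that C
print(best_C, best_score)
```

Because refit=True is the default, search.best_estimator_ is an SVC refitted on all of the data with the chosen C, so search.predict(...) can be called directly.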

The code still works if your data is a DataFrame and a Series. You can try this by adding:

X = pd.DataFrame(X)
Y = pd.Series(Y)

after generating X and Y.

Without reproducible code it is hard to say more. You might also add the sklearn tag to the question.
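One thing worth checking, although the question does not confirm it, is the shape of the labels. A label file read with pd.read_csv comes back as a DataFrame, which is 2-D even when it has a single column, while the Y in my test is 1-D. The sketch below uses a made-up one-column DataFrame in place of your CSV (the column name 'label' is hypothetical) and flattens it before fitting. It assumes a newer scikit-learn where GridSearchCV is in sklearn.model_selection.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.RandomState(0)
train = pd.DataFrame(rng.rand(200, 10))

# Stand-in for the label CSV: a one-column DataFrame (column name is hypothetical)
label = pd.DataFrame({'label': rng.randint(0, 2, 200)})

# Inspect the shapes first: label is 2-D with shape (200, 1)
print(train.shape, label.shape)

# Flatten the single label column to a 1-D array before fitting
y = label.values.ravel()

params = {'C': [0.01, 0.1, 1, 10]}
clf = GridSearchCV(SVC(), params, cv=3)
clf.fit(train, y)
print(clf.best_params_)
```

If label.shape shows more than one column for your file, pick the actual label column first (for example label['label']) rather than flattening everything.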