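I am new to scikit-learn and am trying to use GridSearchCV to find the best parameters for a multilabel classification problem. I cannot get it to work, and I am fairly sure the problem is with the labels.
My code looks like this: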
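import numpy as np
from sklearn.datasets import load_svmlight_file
from sklearn.decomposition import RandomizedPCA
from sklearn.grid_search import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# TRAIN_FILE is the path to the training data in svmlight format
X, Y = load_svmlight_file(TRAIN_FILE, dtype=np.float64, multilabel=True)

clf_pipeline = OneVsRestClassifier(
    Pipeline([('pca', RandomizedPCA()),
              ('clf', SVC())]))

# grid search parameters
c_range = 10.0 ** np.arange(-2, 9)
gamma_range = 10.0 ** np.arange(-5, 4)
n_components_range = (10, 100, 200)
degree_range = (1, 2, 3, 4)

# grid search (note: SVC's regularization parameter is the uppercase C)
param_grid = dict(estimator__clf__gamma=gamma_range,
                  estimator__clf__C=c_range,
                  estimator__clf__degree=degree_range,
                  estimator__pca__n_components=n_components_range)
grid = GridSearchCV(clf_pipeline, param_grid, verbose=2)
grid.fit(X, Y)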
The error is raised on the line "grid.fit(X, Y)". Traceback:
  File "C:\Python27\lib\site-packages\sklearn\grid_search.py", line 597, in fit
    return self._fit(X, y, ParameterGrid(self.param_grid))
  File "C:\Python27\lib\site-packages\sklearn\grid_search.py", line 359, in _fit
    cv = check_cv(cv, X, y, classifier=is_classifier(estimator))
  File "C:\Python27\lib\site-packages\sklearn\cross_validation.py", line 1361, in _check_cv
    cv = StratifiedKFold(y, cv, indices=needs_indices)
  File "C:\Python27\lib\site-packages\sklearn\cross_validation.py", line 429, in __init__
    label_test_folds = test_folds[y == label]
IndexError: too many indices for array
I am using scikit-learn 0.15.
EDIT 1: The code runs fine on Linux, but fails on 64-bit Windows 7.
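
To make the suspicion about the labels concrete: with multilabel=True, load_svmlight_file returns Y as a list of tuples, one tuple of labels per sample, rather than a rectangular array. That may be what StratifiedKFold cannot handle in the traceback, although this is a guess and not a confirmed diagnosis. The sketch below uses made-up labels and a hypothetical helper, binarize_labels, to show what such a Y looks like and how it could be turned into a binary indicator matrix. scikit-learn's MultiLabelBinarizer in sklearn.preprocessing performs the same kind of conversion.

```python
# Sketch only: convert multilabel targets (a list of label tuples, the shape
# returned by load_svmlight_file(..., multilabel=True)) into a binary
# indicator matrix. binarize_labels is a hypothetical helper written for
# illustration, and the sample labels are made up.

def binarize_labels(label_tuples):
    """Return (classes, matrix); matrix[i][j] is 1 if sample i has classes[j]."""
    # Collect every distinct label and give each one a column index.
    classes = sorted(set(label for labels in label_tuples for label in labels))
    column = {label: j for j, label in enumerate(classes)}

    matrix = []
    for labels in label_tuples:
        row = [0] * len(classes)
        for label in labels:
            row[column[label]] = 1
        matrix.append(row)
    return classes, matrix


# Hypothetical labels: the tuples have different lengths, so they do not
# form a regular 2-D array on their own.
Y_example = [(1.0, 3.0), (2.0,), (), (1.0, 2.0, 3.0)]

classes, Y_indicator = binarize_labels(Y_example)
print(classes)
for row in Y_indicator:
    print(row)
```

If the labels do turn out to be the cause, fitting on an indicator matrix like this, possibly together with an explicit non-stratified cv argument to GridSearchCV, would be one thing to try.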