scikit-learn GridSearchCV on KNeighbors

Asked: 2015-01-31 21:09:21

Tags: python scikit-learn

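I can't get GridSearchCV to work with KNeighborsClassifier. Calling grid.fit(dataImpNew, yNew) raises the following error:

TypeError: __init__() got an unexpected keyword argument 'p'

The error is reproducible with any data; the data used here is just dummy data for testing.

Code to reproduce: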

from sklearn.grid_search import GridSearchCV
from sklearn import cross_validation
from sklearn import neighbors
import numpy as np

dataImpNew = np.transpose(np.atleast_2d(np.arange(20.)))*np.arange(20.)
yNew       = np.sign(np.arange(-5.5,14))
nFolds = 4
random_state  = 1234 
metrics       = ['minkowski','euclidean','manhattan'] 
weights       = ['uniform','distance'] #10.0**np.arange(-5,4)
numNeighbors  = np.arange(5,10)
param_grid    = dict(metric=metrics,weights=weights,n_neighbors=numNeighbors)
cv            = cross_validation.StratifiedKFold(yNew,nFolds)
grid = GridSearchCV(neighbors.KNeighborsClassifier(),param_grid=param_grid,cv=cv)
grid.fit(dataImpNew,yNew)

Full traceback:

Traceback (most recent call last):
  File "/home/pjvalla/testDir/test.py", line 25, in <module>
    grid.fit(dataImpNew,yNew)
  File "/usr/lib/python2.7/dist-packages/sklearn/grid_search.py", line 596, in fit
    return self._fit(X, y, ParameterGrid(self.param_grid))
  File "/usr/lib/python2.7/dist-packages/sklearn/grid_search.py", line 378, in _fit
    for parameters in parameter_iterable
  File "/usr/lib/python2.7/dist-packages/joblib/parallel.py", line 653, in __call__
    self.dispatch(function, args, kwargs)
  File "/usr/lib/python2.7/dist-packages/joblib/parallel.py", line 400, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "/usr/lib/python2.7/dist-packages/joblib/parallel.py", line 138, in __init__
    self.results = func(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/sklearn/cross_validation.py", line 1239, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/usr/lib/python2.7/dist-packages/sklearn/neighbors/base.py", line 628, in fit
    return self._fit(X)
  File "/usr/lib/python2.7/dist-packages/sklearn/neighbors/base.py", line 217, in _fit
    **self.effective_metric_kwds_)
  File "binary_tree.pxi", line 1062, in sklearn.neighbors.kd_tree.BinaryTree.__init__ (sklearn/neighbors/kd_tree.c:8380)
  File "dist_metrics.pyx", line 280, in sklearn.neighbors.dist_metrics.DistanceMetric.get_metric (sklearn/neighbors/dist_metrics.c:4066)
TypeError: __init__() got an unexpected keyword argument 'p'
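
The last frames suggest that the keyword arguments in effective_metric_kwds_ are forwarded to the constructor of whichever metric was selected, and that the selected metric does not take a 'p' argument. The sketch below only imitates that failure mode in plain Python. EuclideanLike, MinkowskiLike, METRICS, get_metric and try_metric are hypothetical stand-ins made up for illustration, not scikit-learn's actual classes or code:

```python
# Hypothetical stand-ins; these are NOT scikit-learn's real classes.
class EuclideanLike:
    """A metric with no tunable parameters."""

    def __init__(self):
        pass


class MinkowskiLike:
    """A metric parameterised by the exponent p."""

    def __init__(self, p=2):
        self.p = p


METRICS = {"euclidean": EuclideanLike, "minkowski": MinkowskiLike}


def get_metric(name, **kwargs):
    # Forward every keyword straight to the chosen metric's constructor.
    return METRICS[name](**kwargs)


def try_metric(name, **kwargs):
    """Return 'ok', or the TypeError message raised while building the metric."""
    try:
        get_metric(name, **kwargs)
        return "ok"
    except TypeError as exc:
        return str(exc)


print(try_metric("minkowski", p=2))  # the constructor accepts p
print(try_metric("euclidean", p=2))  # the constructor rejects p
```

If that reading is right, any grid entry that pairs a parameter-free metric with a leftover p value would fail in the same way, regardless of the data.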

1 Answer:

Answer 0 (score: 3)

Works for me, although I had to rename dataImpNew and yNew (dropping the 'New' part):

In [4]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:from sklearn.grid_search import GridSearchCV
:from sklearn import cross_validation
:from sklearn import neighbors
:import numpy as np
:
:dataImp = np.transpose(np.atleast_2d(np.arange(20.)))*np.arange(20.)
:y       = np.sign(np.arange(-5.5,14))
:nFolds = 4
:random_state  = 1234 
:metrics       = ['minkowski','euclidean','manhattan'] 
:weights       = ['uniform','distance'] #10.0**np.arange(-5,4)
:numNeighbors  = np.arange(5,10)
:param_grid    = dict(metric=metrics,weights=weights,n_neighbors=numNeighbors)
:cv            = cross_validation.StratifiedKFold(y,nFolds)
:grid = GridSearchCV(neighbors.KNeighborsClassifier(),param_grid=param_grid,cv=cv)
:grid.fit(dataImp,y)
:
:<EOF>
Out[4]: 
GridSearchCV(cv=sklearn.cross_validation.StratifiedKFold(labels=[-1. -1. -1. -1. -1. -1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.
  1.  1.], n_folds=4, shuffle=False, random_state=None),
       estimator=KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_neighbors=5, p=2, weights='uniform'),
       fit_params={}, iid=True, loss_func=None, n_jobs=1,
       param_grid={'n_neighbors': array([5, 6, 7, 8, 9]), 'metric': ['minkowski', 'euclidean', 'manhattan'], 'weights': ['uniform', 'distance']},
       pre_dispatch='2*n_jobs', refit=True, score_func=None, scoring=None,
       verbose=0)

Can you post the full stack trace?
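
If the error persists on your install, one thing that might be worth trying is to keep metric='minkowski' and search over p instead of the metric name, since the Minkowski distance with p=1 is the Manhattan distance and with p=2 the Euclidean distance. A minimal sketch, assuming a newer scikit-learn in which GridSearchCV and StratifiedKFold live in sklearn.model_selection (on the release in the question they would still be imported from sklearn.grid_search and sklearn.cross_validation). I have not checked it against the exact version shown in the traceback:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

# Same dummy data as in the question (labels cast to int for clarity).
X = np.transpose(np.atleast_2d(np.arange(20.))) * np.arange(20.)
y = np.sign(np.arange(-5.5, 14)).astype(int)

# Vary the Minkowski exponent instead of the metric name:
# p=1 is the Manhattan distance, p=2 is the Euclidean distance.
param_grid = {
    "p": [1, 2],
    "weights": ["uniform", "distance"],
    "n_neighbors": list(range(5, 10)),
}

cv = StratifiedKFold(n_splits=4)
grid = GridSearchCV(
    KNeighborsClassifier(metric="minkowski"),
    param_grid=param_grid,
    cv=cv,
)
grid.fit(X, y)

print(grid.best_params_)
print(grid.best_score_)
```

This covers the same distances as the 'euclidean' and 'manhattan' entries in the original grid, so nothing is lost by dropping the metric names from the search.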