Pandas raises a 600-line ValueError when running sklearn GridSearchCV

Time: 2015-04-17 05:01:11

Tags: python debugging pandas

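I am using sklearn's GridSearchCV to classify some data. When I run the code below, pandas raises an error that is about 600 lines long. If I set n_jobs=1, it is slow but runs without any errors. Am I doing something wrong? If this is a bug, is there somewhere I can report it? Thanks.

sklearn version: 0.16.1, pandas version: 0.16.0, machine: Ubuntu 14.04 LTS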

from sklearn.ensemble import RandomForestClassifier
from sklearn.grid_search import GridSearchCV  # module path in sklearn 0.16, as shown in the traceback

rfc = RandomForestClassifier(random_state=1)
tuned_parameters = [{'max_features': ['sqrt', 'log2'],
                     'n_estimators': [20, 100, 200, 500, 1000]}]

clf = GridSearchCV(rfc, tuned_parameters, scoring='roc_auc', cv=3, n_jobs=-1, verbose=2)
clf.fit(X, y)

Error message:

JoblibValueError                          Traceback (most recent call last)
<ipython-input-21-99a52d0f19db> in <module>()
  4 
  5 clf = GridSearchCV(rfc, tuned_parameters, scoring='roc_auc', cv=3, n_jobs=-1, verbose=2)
----> 6 clf.fit(X, y)

/home/user/anaconda/lib/python2.7/site-packages/sklearn/grid_search.pyc in fit(self, X, y)
730 
731         """
--> 732         return self._fit(X, y, ParameterGrid(self.param_grid))
733 
734 

/home/user/anaconda/lib/python2.7/site-packages/sklearn/grid_search.pyc in _fit(self, X, y, parameter_iterable)
503                                     self.fit_params, return_parameters=True,
504                                     error_score=self.error_score)
--> 505                 for parameters in parameter_iterable
506                 for train, test in cv)
507 

/home/user/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
664                 # consumption.
665                 self._iterating = False
--> 666             self.retrieve()
667             # Make sure that we get a last message telling us we are done
668             elapsed_time = time.time() - self._start_time

/home/user/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in retrieve(self)
547                         # Convert this to a JoblibException
548                         exception_type = _mk_exception(exception.etype)[0]
--> 549                         raise exception_type(report)
550                     raise exception
551                 finally:

JoblibValueError: JoblibValueError
___________________________________________________________________________
Multiprocessing exception:
... (about 600 lines omitted here) ...
/home/user/anaconda/lib/python2.7/site-packages/pandas/algos.so in View.MemoryView.memoryview.__cinit__ (pandas/algos.c:172387)()
316 
317 
318 
319 
320 
--> 321 
322 
323 
324 
325 

ValueError: buffer source array is read-only
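
For context, the final ValueError is the message Cython's memoryview code emits when it is handed a buffer it is not allowed to write to. One plausible cause, which is an assumption and not something the traceback confirms, is that with n_jobs other than 1 joblib hands large input arrays to the worker processes as read-only memory maps. The standard-library sketch below only illustrates the difference between a writable and a read-only buffer; `fill_in_place` is a hypothetical helper and not pandas or joblib code.

```python
from array import array


def fill_in_place(buf, value):
    """Overwrite every element of a buffer; requires a writable buffer."""
    view = memoryview(buf)
    if view.readonly:
        # Same kind of check that yields the message seen in the traceback.
        raise ValueError("buffer source array is read-only")
    for i in range(len(view)):
        view[i] = value
    return view


data = array("d", [1.0, 2.0, 3.0])

# A writable buffer is modified without complaint.
fill_in_place(data, 0.0)

# A read-only view of the same memory is rejected (toreadonly needs Python 3.8+).
read_only = memoryview(data).toreadonly()
try:
    fill_in_place(read_only, 1.0)
except ValueError as exc:
    message = str(exc)
```

If this assumption holds, it would be consistent with the observation that n_jobs=1 works, since no data is shipped to worker processes in that case.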

0 answers:

No answers yet