RandomizedSearchCV and sklearn.externals.joblib.externals.loky.process_executor.TerminatedWorkerError problem

Time: 2019-05-13 16:23:57

Tags: python machine-learning xgboost

I am using RandomizedSearchCV to find the best configuration for my XGBClassifier learner, and I am running into a TerminatedWorkerError.

Python code

from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier
rs = RandomizedSearchCV(XGBClassifier(), param_grid, cv=5, scoring='f1_micro', n_jobs=16, verbose=1, n_iter=50000)
rs.fit(X_tr, y_tr)

Console output

Fitting 5 folds for each of 50000 candidates, totalling 250000 fits
[Parallel(n_jobs=16)]: Using backend LokyBackend with 16 concurrent workers.
[Parallel(n_jobs=16)]: Done  18 tasks      | elapsed:   30.2s
[Parallel(n_jobs=16)]: Done 168 tasks      | elapsed:  2.4min
[Parallel(n_jobs=16)]: Done 418 tasks      | elapsed:  4.4min
[Parallel(n_jobs=16)]: Done 768 tasks      | elapsed:  8.7min
[Parallel(n_jobs=16)]: Done 1218 tasks      | elapsed: 14.5min
[Parallel(n_jobs=16)]: Done 1768 tasks      | elapsed: 19.9min
[Parallel(n_jobs=16)]: Done 2418 tasks      | elapsed: 27.8min
[Parallel(n_jobs=16)]: Done 3168 tasks      | elapsed: 37.1min
[Parallel(n_jobs=16)]: Done 4018 tasks      | elapsed: 47.0min
[Parallel(n_jobs=16)]: Done 4968 tasks      | elapsed: 57.9min
[Parallel(n_jobs=16)]: Done 6018 tasks      | elapsed: 69.6min
[Parallel(n_jobs=16)]: Done 7168 tasks      | elapsed: 82.3min
[Parallel(n_jobs=16)]: Done 8418 tasks      | elapsed: 96.4min
[Parallel(n_jobs=16)]: Done 9768 tasks      | elapsed: 110.2min
[Parallel(n_jobs=16)]: Done 11218 tasks      | elapsed: 126.7min
[Parallel(n_jobs=16)]: Done 12768 tasks      | elapsed: 143.4min
[Parallel(n_jobs=16)]: Done 14418 tasks      | elapsed: 162.3min
[Parallel(n_jobs=16)]: Done 16168 tasks      | elapsed: 183.0min
[Parallel(n_jobs=16)]: Done 18018 tasks      | elapsed: 203.9min
exception calling callback for <Future at 0x1980abf4908 state=finished raised TerminatedWorkerError>
Traceback (most recent call last):
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\parallel.py", line 309, in __call__
    self.parallel.dispatch_next()
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\parallel.py", line 731, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 510, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit
    fn, *args, **kwargs)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1022, in submit
    raise self._flags.broken
sklearn.externals.joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.
Traceback (most recent call last):
  File "main.py", line 117, in <module>
    rs.fit(X_tr, y_tr)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\model_selection\_search.py", line 680, in fit
    self._run_search(evaluate_candidates)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\model_selection\_search.py", line 1460, in _run_search
    random_state=self.random_state))
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\model_selection\_search.py", line 669, in evaluate_candidates
    cv.split(X, y, groups)))
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\parallel.py", line 930, in __call__
    self.retrieve()
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\parallel.py", line 309, in __call__
    self.parallel.dispatch_next()
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\parallel.py", line 731, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 510, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit
    fn, *args, **kwargs)
  File "C:\Users\Huy.Nguyen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1022, in submit
    raise self._flags.broken
sklearn.externals.joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

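It takes quite a long time before the error appears. I have been searching Google but have not found a solution. At the time of the crash, my memory usage was not even at 30%. Running the same code on another computer works without any problems. Any suggestions on how to fix this?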
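
For scale, here is a rough sketch of the arithmetic behind the console output above. It assumes the last progress line (18018 tasks at 203.9 min) is close to the point of failure and that the average rate up to then would have held for the rest of the run; neither is confirmed by the log, and the variable names are only illustrative.

```python
# Figures copied from the console output above.
n_candidates = 50000
n_folds = 5
total_fits = n_candidates * n_folds  # matches "totalling 250000 fits"

# Last progress line printed before the traceback.
done_tasks = 18018
elapsed_min = 203.9

# Share of the search that had finished when the worker died.
fraction_done = done_tasks / total_fits

# Average throughput so far, in fits per minute.
rate_per_min = done_tasks / elapsed_min

# Assumption: the average rate so far holds for the whole run.
projected_hours = total_fits / rate_per_min / 60

print(f"completed before crash: {fraction_done:.1%}")
print(f"average rate: {rate_per_min:.1f} fits/min")
print(f"projected full run: {projected_hours:.1f} hours")
```

In other words, the worker was terminated after only a small fraction of the planned fits, while the full search would have needed on the order of two days at the observed rate.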

Thanks

0 answers:

No answers