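When I set the n_jobs parameter to a value greater than 1 for a random forest regressor, I get the following error. If I set n_jobs=1, everything works fine.
AttributeError: 'Thread' object has no attribute '_children'
I am running this code inside a Flask service. Interestingly, the error does not occur when the same code runs outside the Flask service. So far I have only reproduced it on a freshly installed Ubuntu box. On my Mac it runs fine.
There is a thread discussing this issue, but it doesn't seem to go beyond a workaround: 'Thread' object has no attribute '_children' - django + scikit-learn
Any thoughts on this?
Thanks, everyone!
Here is my test code: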
@test.route('/testfun')
def testfun():
    from sklearn.ensemble import RandomForestRegressor
    import numpy as np

    train_data = np.array([[1,2,3], [2,1,3]])
    target_data = np.array([1,1])

    model = RandomForestRegressor(n_jobs=2)
    model.fit(train_data, target_data)
    return "yey"
Stack trace:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/vagrant/flask.global-relevance-engine/global_relevance_engine/routes/test.py", line 47, in testfun
    model.fit(train_data, target_data)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/forest.py", line 273, in fit
    for i, t in enumerate(trees))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 574, in __call__
    self._pool = ThreadPool(n_jobs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 685, in __init__
    Pool.__init__(self, processes, initializer, initargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 136, in __init__
    self._repopulate_pool()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 199, in _repopulate_pool
    w.start()
  File "/usr/lib/python2.7/multiprocessing/dummy/__init__.py", line 73, in start
    self._parent._children[self] = None
Answer 0 (score: 6)
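This is probably caused by a bug in multiprocessing.dummy (see here and here) that existed before Python 2.7.5 and 3.3.2.
See the comments for confirmation that a newer version worked for the OP.
If you can't upgrade but have access to .../py/Lib/multiprocessing/dummy/__init__.py, edit the start method of the DummyProcess class as follows (it should be around line 73):
if hasattr(self._parent, '_children'):  # add this line
    self._parent._children[self] = None  # indent this existing line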
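DummyProcess is where this bug lives. Let's look at where it sits in your imported code to make sure we patch it in the right place. Judging by the stack trace in the question, the chain runs from sklearn/ensemble/forest.py through sklearn/externals/joblib/parallel.py and multiprocessing/pool.py (ThreadPool) to multiprocessing/dummy/__init__.py.
The presence of DummyProcess in that chain guarantees that it has been imported once RandomForestRegressor has been imported.
Also, I think we can get at the DummyProcess class before any instances are created, so we can patch the class once instead of hunting down instances to patch.
# Let's make it available in our namespace:
from sklearn.ensemble import RandomForestRegressor
from multiprocessing import dummy as __mp_dummy

# Now we can define a replacement and patch DummyProcess:
def __DummyProcess_start_patch(self):  # pulled from an updated version of Python
    assert self._parent is __mp_dummy.current_process()  # modified to avoid further imports
    self._start_called = True
    if hasattr(self._parent, '_children'):
        self._parent._children[self] = None
    __mp_dummy.threading.Thread.start(self)  # modified to avoid further imports

__mp_dummy.DummyProcess.start = __DummyProcess_start_patch
Unless I've missed something, every DummyProcess instance created from now on will be patched, so the error won't occur.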
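For anyone using sklearn more broadly, I think you can do this the other way around and make it work for all of sklearn instead of focusing on one module.
You would need to import DummyProcess and patch it as above before doing any sklearn imports.
Then sklearn will use the patched class from the start.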
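A minimal sketch of that reversed order, assuming the same replacement start method shown earlier. The names patched_start and run_pool_in_plain_thread are hypothetical; the helper is only there to exercise the patch by opening a thread pool from a thread that multiprocessing.dummy did not create, which is the situation in the stack trace. The sklearn imports are left as a comment so the snippet runs without sklearn installed.

```python
import threading
from multiprocessing import dummy as mp_dummy


def patched_start(self):
    # Same logic as the guarded start() found in newer Python versions.
    assert self._parent is mp_dummy.current_process()
    self._start_called = True
    # The parent may be a plain threading.Thread without a _children
    # mapping (for example a web server's request thread), so guard it.
    if hasattr(self._parent, '_children'):
        self._parent._children[self] = None
    threading.Thread.start(self)


# Patch the class once, before anything else has a chance to use it.
mp_dummy.DummyProcess.start = patched_start

# Any sklearn imports would go here, after the patch, e.g.:
# from sklearn.ensemble import RandomForestRegressor


def run_pool_in_plain_thread():
    """Hypothetical check: use a thread pool from a plain thread."""
    results = []

    def worker():
        pool = mp_dummy.Pool(2)
        try:
            results.append(pool.map(abs, [-1, -2, 3]))
        finally:
            pool.close()
            pool.join()

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return results
```

On interpreters that already contain the fix the patch should be harmless, since it only reinstates the same guard.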
Original answer:
As I was writing a comment, I realized I may have found your problem: I think your Flask environment is using an older version of Python.
The reason is that in the latest version of Python's multiprocessing, the line where you get that error is protected by a condition:
if hasattr(self._parent, '_children'):
    self._parent._children[self] = None
It looks like this bug was fixed during Python 2.7 (I think from 2.7.5 onward). Perhaps your Flask is running on 2.7 or 2.6?
Can you check your environment? If you can't update the interpreter, perhaps we can find a way to monkey patch multiprocessing to keep it from crashing.
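One way to check the environment is to look at sys.version_info from inside the process that serves the Flask app. The sketch below is only an illustration: has_dummy_fix is a hypothetical helper, and its thresholds (2.7.5 and 3.3.2) are taken from this answer rather than verified against the Python changelog.

```python
import sys


def has_dummy_fix(version_info=sys.version_info):
    """Guess whether this interpreter already guards the _children access.

    Thresholds follow the versions mentioned in this answer
    (2.7.5 for Python 2, 3.3.2 for Python 3); treat them as an assumption.
    """
    version = tuple(version_info[:3])
    if version[0] == 2:
        return version >= (2, 7, 5)
    return version >= (3, 3, 2)


# Run this inside the Flask process (for example from a route handler),
# since the service may use a different interpreter than your shell.
print(sys.version)
print(has_dummy_fix())
```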