scikit-learn's GridSearchCV stops working when n_jobs > 1

Asked: 2014-08-12 08:51:27

Tags: python numpy scikit-learn

I previously asked here about the following lines of code:

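from sklearn import neighbors
from sklearn.grid_search import GridSearchCV  # location in sklearn 0.14; later releases moved it to sklearn.model_selection

parameters = [{'weights': ['uniform'], 'n_neighbors': [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]}]
clf = GridSearchCV(neighbors.KNeighborsRegressor(), parameters, n_jobs=4)
clf.fit(features, rewards)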

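However, when I run this, I hit a different problem that is unrelated to my earlier question. Python eventually crashes with the following operating system error report: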

Process:         Python [1327]
Path:            /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
Identifier:      Python
Version:         2.7.2.5 (2.7.2.5.r64662-trunk)
Code Type:       X86-64 (Native)
Parent Process:  Python [1316]
Responsible:     Sublime Text 2 [308]
User ID:         501

Date/Time:       2014-08-12 10:27:24.640 +0200
OS Version:      Mac OS X 10.9.4 (13E28)
Report Version:  11
Anonymous UUID:  D10CD8B7-221F-B121-98D4-4574A1F2189F

Sleep/Wake UUID: 0B9C4AE0-26E6-4DE8-B751-665791968115

Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000110

VM Regions Near 0x110:
--> 
__TEXT                 0000000100000000-0000000100001000 [    4K] r-x/rwx SM=COW  /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python

Application Specific Information:
*** multi-threaded process forked ***
crashed on child side of fork pre-exec

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libdispatch.dylib               0x00007fff91534c90 dispatch_group_async_f + 141
1   libBLAS.dylib                   0x00007fff9413f791 APL_sgemm + 1061
2   libBLAS.dylib                   0x00007fff9413cb3f cblas_sgemm + 1267
3   _dotblas.so                     0x0000000102b0236e dotblas_matrixproduct + 5934
4   org.activestate.ActivePython27  0x00000001000c552d PyEval_EvalFrameEx + 23949
5   org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
6   org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
7   org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
8   org.activestate.ActivePython27  0x000000010003d390 function_call + 176
9   org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
10  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
11  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
12  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
13  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
14  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
15  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
16  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
17  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
18  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
19  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
20  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
21  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
22  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
23  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
24  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
25  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
26  org.activestate.ActivePython27  0x0000000100077dfa slot_tp_call + 74
27  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
28  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
29  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
30  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
31  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
32  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
33  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
34  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
35  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
36  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
37  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
38  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
39  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
40  org.activestate.ActivePython27  0x0000000100077a28 slot_tp_init + 88
41  org.activestate.ActivePython27  0x0000000100074e25 type_call + 245
42  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
43  org.activestate.ActivePython27  0x00000001000c267d PyEval_EvalFrameEx + 11997  
44  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
45  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
46  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
47  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
48  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
49  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
50  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
51  org.activestate.ActivePython27  0x0000000100077a28 slot_tp_init + 88
52  org.activestate.ActivePython27  0x0000000100074e25 type_call + 245
53  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
54  org.activestate.ActivePython27  0x00000001000c267d PyEval_EvalFrameEx + 11997
55  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
56  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
57  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
58  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
59  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98   
60  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
61  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
62  org.activestate.ActivePython27  0x0000000100077dfa slot_tp_call + 74
63  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
64  org.activestate.ActivePython27  0x00000001000c267d PyEval_EvalFrameEx + 11997
65  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
66  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
67  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
68  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
69  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
70  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
71  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
72  org.activestate.ActivePython27  0x00000001000c7bf6 PyEval_EvalCode + 54
73  org.activestate.ActivePython27  0x00000001000ed31e PyRun_FileExFlags + 174
74  org.activestate.ActivePython27  0x00000001000ed5d9 PyRun_SimpleFileExFlags + 489
75  org.activestate.ActivePython27  0x00000001001041dc Py_Main + 2940
76  org.activestate.ActivePython27.app  0x0000000100000ed4 0x100000000 + 3796

Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000100  rbx: 0x00007fff7cd43640  rcx: 0x0000000000000000  rdx: 0x0000000105e00000
rdi: 0x0000000000000008  rsi: 0x0000000105e01000  rbp: 0x00007fff5fbfa370  rsp: 0x00007fff5fbfa350
r8: 0x0000000000000001   r9: 0x0000000105e00000  r10: 0x0000000105e01000  r11: 0x0000000000000000
r12: 0x000000010ba10530  r13: 0x000000010b000000  r14: 0x00000001066d1970  r15: 0x00007fff915311af
rip: 0x00007fff91534c90  rfl: 0x0000000000010206  cr2: 0x0000000000000110

Logical CPU:     2
Error Code:      0x00000006
Trap Number:     14

.........
VM Region Summary:
ReadOnly portion of Libraries: Total=183.7M resident=97.0M(53%) swapped_out_or_unallocated=86.7M(47%)
Writable regions: Total=1.3G written=142.8M(11%) resident=503.6M(39%) swapped_out=0K(0%) unallocated=791.7M(61%)

When I replace the second line of the code with:

clf = GridSearchCV(neighbors.KNeighborsRegressor(), parameters, n_jobs=1)

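then everything works, except that I am not using multiple threads.

My operating system is OS X 10.9.4.

My Python version is 2.7.8 | Anaconda 2.0.1 (x86_64) | (default, Jul 2 2014, 15:36:00) [GCC 4.2.1 (Apple Inc. build 5577)]

My scikit-learn version is 0.14.1.

My numpy version is 1.8.1.

My scipy version is 0.14.0.

My question is: does anyone know how to run GridSearchCV on multiple threads?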

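Edit

I have since realized that this error actually occurs only with some of my input datasets. Unfortunately, the problematic datasets (their X) are too large to reproduce here. The input feature data is basically tf-idf vectors, and the y vector consists of non-negative floats, specifically: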

[60.0, 7.0, 12.0, 21.0, 5.5, 3.0, 0.0, 2.5, 11.0, 3.0, 16.0, 2.0, 0.0, 4.5, 2.5, 6.0, 9.5, 2.5, 15.0, 7.0, 8.0, 13.0, 14.0, 8.0, 3.5, 6.0, 22.5, 7.0, 4.0, 3.5, 4.5, 6.0, 5.5, 7.0, 2.0, 0.0, 0.0, 0.0, 14.5, 8.0, 7.5, 2.5, 11.5, 1.0, 3.0, 14.5, 10.0, 14.5, 8.0, 8.0, 7.0, 2.5, 3.5, 3.0, 13.5, 7.0, 6.5, 2.5, 9.0, 8.0, 11.0, 17.5, 12.5, 4.5, 5.5, 8.0, 2.0, 7.0, 4.0, 1.5, 3.0, 21.5, 4.5, 4.0, 7.0, 9.0, 13.5, 8.0, 10.5, 4.5, 1.5, 11.5, 7.5, 11.5, 4.5, 5.0, 7.0, 9.5, 4.0, 4.0, 6.0, 3.5, 4.5, 7.5, 3.5, 3.5, 3.5, 6.0, 5.0, 5.5, 25.0, 6.5, 5.0, 2.0, 2.0, 10.5, 0.0, 6.5, 19.0, 9.0, 1.0, 1.5, 1.0, 0.0, 1.0, 4.5, 2.5, 17.5, 39.5, 7.5, 5.5, 8.0, 1.0, 6.0, 12.0, 10.0, 5.5, 19.0, 4.5, 1.5, 25.5, 4.0, 10.0, 18.5, 9.5, 10.5, 2.5, 6.0, 1.0, 10.0, 8.5, 12.5, 13.5, 5.0, 6.5, 11.0, 4.5, 8.0, 7.5, 11.5, 14.5, 9.0, 3.0, 1.5, 3.5, 5.5, 2.5, 12.5, 6.5, 5.5, 5.0, 0.0, 8.0, 3.0, 14.5, 5.0, 14.0, 7.0, 13.5, 12.5, 4.0, 1.5, 6.5, 10.5, 9.0, 16.5, 4.0, 4.0, 15.0, 11.5, 2.5, 8.5, 3.0, 5.0, 4.0, 8.5, 6.0, 5.0, 5.0, 5.0, 5.5, 8.0, 11.0, 4.0, 0.0, 5.5, 0.0, 4.5, 1.5, 0.0, 6.5, 11.0, 2.5, 8.0, 15.5, 5.5, 4.5, 5.0, 4.0, 5.5, 10.5, 7.5, 6.5, 8.5, 2.5, 1.5, 1.5, 18.0, 15.0, 14.0, 9.5, 5.5, 7.5, 14.5, 2.5, 5.0, 60.0, 6.5, 14.5, 6.5, 4.0, 1.5, 2.0, 4.0, 27.0, 3.0, 5.0, 4.0, 2.5, 1.0, 1.5, 1.5, 9.0, 4.0, 8.5, 4.0, 4.0, 0.0, 1.5, 7.5, 1.5, 7.5, 1.0, 28.5, 15.5, 7.5, 1.0, 2.5, 2.5, 2.5, 16.0, 5.5, 8.5, 4.0, 2.5, 5.0, 2.5, 6.0, 11.0, 10.0, 4.5, 6.5, 8.0, 6.0, 4.5, 15.5, 4.0, 5.0]

The version with 1 job works for all of my input datasets, including this one.

2 answers:

Answer 0 (score: 4)

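libdispatch.dylib, which is part of Grand Central Dispatch (GCD), is used internally by Accelerate, the BLAS implementation built into OS X, when numpy.dot is called. The GCD runtime does not work when a program calls the POSIX fork system call without calling exec afterwards, which makes Python programs that use the multiprocessing module prone to crashes. sklearn's GridSearchCV uses the Python multiprocessing module for parallelization.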
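The crash report's "multi-threaded process forked" line points at a general property of fork that can be shown without numpy: only the thread that calls fork exists in the child. The sketch below is a standard-library illustration of that property, not a reproduction of the Accelerate crash; the function name `count_threads_after_fork` is made up for this example, and it assumes a POSIX system where `os.fork` is available.

```python
import os
import threading


def count_threads_after_fork():
    """Return (parent_threads, child_threads) measured around a bare os.fork()."""
    stop = threading.Event()
    # A helper thread stands in for a library-owned worker thread.
    helper = threading.Thread(target=stop.wait, daemon=True)
    helper.start()
    parent_threads = threading.active_count()

    read_fd, write_fd = os.pipe()
    pid = os.fork()  # fork without exec, as multiprocessing's "fork" mode does
    if pid == 0:
        # Child: report how many Python threads exist here, then exit at once.
        try:
            os.close(read_fd)
            os.write(write_fd, str(threading.active_count()).encode())
        finally:
            os._exit(0)

    # Parent: read the child's answer and clean up.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        child_threads = int(pipe.read())
    os.waitpid(pid, 0)
    stop.set()
    helper.join()
    return parent_threads, child_threads


if __name__ == "__main__":
    parent, child = count_threads_after_fork()
    print("threads in parent:", parent, "| threads in child:", child)
```

A runtime that expects its own worker threads to still be running in the child can misbehave there, which is consistent with the frame `dispatch_group_async_f` at the top of the crashed thread. Recent Python versions may emit a DeprecationWarning when a multi-threaded process forks.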

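In Python 3.4 and later, you can force Python multiprocessing to use the forkserver start method instead of the default fork mode to work around this issue, for example at the beginning of the main file of your program: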

if __name__ == "__main__":
    import multiprocessing as mp
    mp.set_start_method('forkserver')
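Since forkserver is not offered on every platform, a slightly more defensive variant may be useful. This is a minimal sketch using only the standard library; the helper name `pick_start_method` and its fallback order are assumptions made for illustration, not part of multiprocessing or scikit-learn.

```python
import multiprocessing as mp


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first start method in `preferred` that this platform offers."""
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    return available[0]  # the first listed method is the platform default


if __name__ == "__main__":
    # This has to happen before any pool or worker process is created.
    try:
        mp.set_start_method(pick_start_method())
    except RuntimeError:
        pass  # a start method was already chosen earlier in this process
    print("start method:", mp.get_start_method())
```

With the start method set this way, the GridSearchCV call with n_jobs greater than 1 would go after the `set_start_method` call, inside the same `if __name__ == "__main__":` block.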

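Alternatively, you can rebuild numpy from source and link it against ATLAS or OpenBLAS instead of OS X Accelerate. The numpy developers are working on binary distributions that include ATLAS or OpenBLAS by default.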
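Before rebuilding, it may help to check which BLAS the installed numpy is linked against. The sketch below captures the text printed by `numpy.show_config()`; the helper name `numpy_build_report` is hypothetical, and because the layout of the report differs between numpy versions, the string match at the end is only a rough heuristic.

```python
import contextlib
import io


def numpy_build_report():
    """Return numpy's build configuration as text, or a note if numpy is missing."""
    try:
        import numpy as np
    except ImportError:
        return "numpy is not installed"
    buffer = io.StringIO()
    # show_config() prints to stdout, so capture that output in a string.
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    report = numpy_build_report()
    print(report)
    # Heuristic only: Accelerate builds usually mention one of these names.
    lowered = report.lower()
    if "accelerate" in lowered or "veclib" in lowered:
        print("This numpy build appears to use OS X Accelerate.")
```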

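Answer 1 (score: 1)

This also worked perfectly for me (upgrading was a bit of a drag, but of the many fixes I tried, this was the only one that worked in my case). For any other IPython notebook users, the best approach is to add it to the notebook configuration (trying to run it directly in the notebook raises an error). The command can be added like this: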

# in ipython_notebook_config.py
c.IPKernelApp.exec_lines = ['import multiprocessing', 'multiprocessing.set_start_method("forkserver")']