Parallelizing an exhaustive search in Cython

Time: 2018-11-08 01:40:38

Tags: python performance parallel-processing cython gil

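I'm fairly new to Cython and am trying to Cythonize some of my code. I have a 3D array X of complex values (which I think of as a big "stack" of square arrays) with shape (small, small, huge), and I need to find the position and absolute value of the largest off-diagonal entry. At the moment I do an exhaustive search, like this: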

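# Assumes `import numpy as np` at module level and the `cabs` declaration given in this post.
# Work on a typed memoryview of a copy of X.
cdef double complex[:,:,:] Xcp = X.copy()

cdef Py_ssize_t h = Xcp.shape[0]
cdef Py_ssize_t w = Xcp.shape[1]
cdef Py_ssize_t l = Xcp.shape[2]
cdef Py_ssize_t j, k, m

cdef double tmptop = 0.0  # largest magnitude found so far
cdef Py_ssize_t[:] coords = np.zeros((3), dtype="intp")  # position (m, k, j) of that entry
cdef double it

for j in range(l):
    for k in range(w):
        for m in range(k+1, h):  # entries strictly below the diagonal
            it = cabs(Xcp[m, k, j])
            if it > tmptop:
                tmptop = it
                coords[0] = m
                coords[1] = k
                coords[2] = j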

Note that I get cabs from here:

cdef extern from "complex.h":
    double cabs(double complex)

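This code is already much faster than what I was running in NumPy before, but I do think it could be sped up further, especially by parallelizing it.

I tried changing the loop to this: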

with nogil:
    for j in prange(l):
        for k in range(w):
            for m in range(k+1, h):
                it = abs(Xcp[m, k, j])
                if it > tmptop:
                    tmptop = it
                    coords[0] = m
                    coords[1] = k
                    coords[2] = j

Now, though, I get wrong results. What is going on here?
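A likely cause, offered as a sketch rather than a confirmed diagnosis: as far as the Cython documentation on `prange` describes it, a variable that is plainly assigned inside the loop body becomes thread-private (lastprivate), so `tmptop` no longer acts as a single running maximum, while `coords` is one shared buffer that every thread writes to. The usual remedy is for each worker to keep its own local best and to combine those results once at the end. The plain-Python sketch below shows only that pattern on nested lists. The names `best_in_chunk` and `parallel_argmax` are hypothetical, and Python threads here illustrate the structure, not a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def best_in_chunk(X, j_start, j_stop):
    """Scan slices j_start..j_stop-1 using only local state.

    X is indexed as X[m][k][j]; returns (magnitude, (m, k, j)).
    """
    h, w = len(X), len(X[0])
    best_val, best_pos = 0.0, (0, 0, 0)
    for j in range(j_start, j_stop):
        for k in range(w):
            for m in range(k + 1, h):  # strictly below the diagonal, as in the question
                val = abs(X[m][k][j])
                if val > best_val:
                    best_val, best_pos = val, (m, k, j)
    return best_val, best_pos


def parallel_argmax(X, n_workers=4):
    """Split the long axis into chunks, search each independently, then reduce."""
    l = len(X[0][0])
    step = max(1, -(-l // n_workers))  # ceiling division
    bounds = [(s, min(s + step, l)) for s in range(0, l, step)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(lambda b: best_in_chunk(X, *b), bounds))
    # Final reduction over the per-chunk results; no worker ever wrote shared state.
    return max(partial, key=lambda r: r[0], default=(0.0, (0, 0, 0)))
```

In Cython the same idea would presumably mean giving each thread its own slot for the running maximum and coordinates, then scanning those few slots serially after the parallel loop.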

0 answers:

No answers