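I am running some performance tests on variants of the prime number generator from http://docs.cython.org/src/tutorial/numpy.html. The timings below are for kmax = 1000.
Pure Python implementation, running in CPython: 0.15s
Pure Python implementation, running in Cython: 0.07s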
def primes(kmax):
    p = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            p.append(n)
            k = k + 1
        n = n + 1
    return p
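The post does not show how the timings were taken. As a minimal sketch, assuming the standard-library `timeit` module and a best-of-three policy (both are my assumptions, not the asker's setup), the pure Python variant could be measured like this:

```python
import timeit


def primes(kmax):
    """Return the first kmax primes by trial division (same loop as above)."""
    p = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        # Stop at the first stored prime that divides n
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            # No stored prime divides n, so n is prime
            p.append(n)
            k = k + 1
        n = n + 1
    return p


# Best of three single runs, in seconds; absolute values depend on the machine.
best = min(timeit.repeat(lambda: primes(1000), number=1, repeat=3))
print(f"primes(1000): {best:.3f} s")
```

Taking the minimum rather than the mean reduces the influence of unrelated system load on the reported number.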
Pure Python + NumPy implementation, running in CPython: 1.25s
import numpy
def primes(kmax):
    p = numpy.empty(kmax, dtype=int)
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            p[k] = n
            k = k + 1
        n = n + 1
    return p
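Cython implementation using int *: 0.003s
from libc.stdlib cimport malloc, free
def primes(int kmax):
    cdef int n, k, i
    # C scratch buffer holding the primes found so far
    cdef int *p = <int *>malloc(kmax * sizeof(int))
    if p == NULL:
        raise MemoryError()
    # Python list holding the same values, returned to the caller
    result = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            p[k] = n
            k = k + 1
            result.append(n)
        n = n + 1
    free(p)
    return result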
The above performs very well but looks horrible, because it holds two copies of the data... so I tried reimplementing it:
Cython + NumPy: 1.01s
import numpy as np
cimport numpy as np
cimport cython
DTYPE = np.int
ctypedef np.int_t DTYPE_t
@cython.boundscheck(False)
def primes(DTYPE_t kmax):
    cdef DTYPE_t n, k, i
    cdef np.ndarray p = np.empty(kmax, dtype=DTYPE)
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            p[k] = n
            k = k + 1
        n = n + 1
    return p
Question:
How do I cast a numpy array to an int *? The following does not work:
cdef numpy.nparray a = numpy.zeros(100, dtype=int)
cdef int * p = <int *>a.data
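Answer 0 (score: 9)
Declare a typed memoryview over the array:
cdef DTYPE_t [:] p_view = p
Use this instead of p in the computation. For me, the runtime dropped from 580 ms to 2.8 ms, which is about the same as the implementation using int *. That is about the most you can expect.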
import numpy as np
cimport numpy as np
cimport cython
DTYPE = np.int
ctypedef np.int_t DTYPE_t
@cython.boundscheck(False)
def primes(DTYPE_t kmax):
    cdef DTYPE_t n, k, i
    cdef np.ndarray p = np.empty(kmax, dtype=DTYPE)
    # Typed view onto p's buffer; element access through it compiles to C
    cdef DTYPE_t [:] p_view = p
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p_view[i] != 0:
            i = i + 1
        if i == k:
            p_view[k] = n
            k = k + 1
        n = n + 1
    return p
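A typed memoryview such as `DTYPE_t [:]` is a typed window onto the array's existing buffer rather than a copy, which is why writes through `p_view` end up in `p`. As a rough pure-Python analogy only (the built-in `memoryview` and the `array` module stand in for the Cython feature; `buf` and `view` are illustrative names), the shared-buffer behaviour looks like this:

```python
from array import array

# 'i' is the type code for a C int; buf owns one contiguous block of 5 ints.
buf = array("i", [0] * 5)

# The view exposes that same block through the buffer protocol, without copying.
view = memoryview(buf)

# Writes through the view land directly in buf's memory.
view[0] = 2
view[1] = 3

print(buf.tolist())
print(view.format, view.itemsize, view.nbytes)
```

The speedup in the Cython version comes from the element type and dimensionality being known at compile time, so indexing does not go through Python objects; this analogy shows only the sharing of memory, not the speed.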
Answer 1 (score: 5)
Why is the numpy array so much slower than a Python list when running on CPython?
Because you did not fully type it. Use
cdef np.ndarray[DTYPE_t, ndim=1] p = np.empty(kmax, dtype=DTYPE)
How do I cast a numpy array to an int *?
Use np.intc as the dtype, not np.int (which is a C long). That is:
cdef np.ndarray[dtype=int, ndim=1] p = np.empty(kmax, dtype=np.intc)
(But really, use memoryviews: they are much cleaner, and in the long run the Cython folks want to get rid of the NumPy array syntax.)
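The reason the dtype matters for the int * cast is element width: if the array stores C longs and the pointer reads C ints, the bytes are reinterpreted rather than converted. A minimal sketch using the standard-library `ctypes` module (the names `sizes`, `longs` and `as_ints` are illustrative, and the result depends on the platform) makes this visible:

```python
import ctypes

# Width in bytes of the two C types on this platform
sizes = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
}
print(sizes)

# A buffer of two C longs, standing in for an array created with a long dtype
longs = (ctypes.c_long * 2)(2, 3)

# Reinterpret the same memory as C ints, which is what a cast to int * does
as_ints = ctypes.cast(longs, ctypes.POINTER(ctypes.c_int))
first_two = [as_ints[0], as_ints[1]]
print(first_two)
```

Where int and long have the same width, `first_two` reproduces the stored values; where long is wider, the ints read back are halves of the first long, so the values no longer match.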
Answer 2 (score: 1)
The best syntax I have found so far:
import numpy
cimport numpy
cimport cython
@cython.boundscheck(False)
@cython.wraparound(False)
def primes(int kmax):
    cdef int n, k, i
    cdef numpy.ndarray[int] p = numpy.empty(kmax, dtype=numpy.int32)
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            p[k] = n
            k = k + 1
        n = n + 1
    return p
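Note where I used numpy.int32 rather than int. Anything on the left side of a cdef is a C type (so int = int32 and float = float32 on typical platforms), while anything on the right side of it (or outside a cdef) is a Python type (int = int64 and float = float64).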
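The difference between the C and Python meanings of float and int can be checked from plain Python. The sketch below uses the standard-library `struct` module with native format codes ("f" for C float, "d" for C double, "i" for C int); the variable names are illustrative and the sizes are those of the platform it runs on:

```python
import struct

# Native sizes in bytes of a C float and a C double on this platform
c_float_bytes = struct.calcsize("f")
c_double_bytes = struct.calcsize("d")
print(c_float_bytes, c_double_bytes)

x = 0.1
# Round-trip the Python float through a C float and through a C double
via_float = struct.unpack("f", struct.pack("f", x))[0]
via_double = struct.unpack("d", struct.pack("d", x))[0]
print(via_double == x)  # a Python float survives the trip through a C double
print(via_float == x)   # precision is lost in the trip through a C float

# A Python int is unbounded, while a C int has a fixed width
try:
    struct.pack("i", 2**100)
    fits_in_c_int = True
except struct.error:
    fits_in_c_int = False
print(fits_in_c_int)
```

This is why the dtype passed to NumPy on the right-hand side has to be chosen to match the C type declared on the left-hand side.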