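I'm new to Cython. Why is my C function Numeraire, which at this point only wraps a single built-in function, so much slower than calling the built-in function directly?
Thanks. Here is the Cython code (backward.pyx):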
import numpy as np
cimport numpy as np
from libc.math cimport exp

cdef double Numeraire(int i1, int i0, np.ndarray[np.int_t, ndim=1] j):
    cdef float rate = 0.05
    return exp(-rate/12*(i1 - i0))

def Slow(np.ndarray[np.float_t, ndim=2] values, int i1, int i0):
    cdef float norm = 0.25
    cdef int i, j0, j1
    cdef np.ndarray[np.int_t, ndim=1] j = np.empty(2, dtype=np.int)
    for i in range(i1-1, i0-1, -1):
        for j0 in range(i+1):
            j[0] = j0
            for j1 in range(i+1):
                j[1] = j1
                values[j0, j1] += (
                    values[j0+1, j1  ] +
                    values[j0  , j1+1] +
                    values[j0+1, j1+1])
                values[j0, j1] *= norm*Numeraire(i+1, i, j) #4.397s (!)

def Fast(np.ndarray[np.float_t, ndim=2] values, int i1, int i0):
    cdef float norm = 0.25
    cdef int i, j0, j1
    cdef np.ndarray[np.int_t, ndim=1] j = np.empty(2, dtype=np.int)
    for i in range(i1-1, i0-1, -1):
        for j0 in range(i+1):
            j[0] = j0
            for j1 in range(i+1):
                j[1] = j1
                values[j0, j1] += (
                    values[j0+1, j1  ] +
                    values[j0  , j1+1] +
                    values[j0+1, j1+1])
                values[j0, j1] *= norm*exp(-0.05/12*((i+1) - i)) #0.327s
Here is the timing information:
In [1]: import numpy as np
In [2]: import backward
In [3]: factors=2
In [4]: i=360
In [5]: %timeit backward.Fast(np.ones([i+1]*factors), i, 0)
10 loops, best of 3: 104 ms per loop
In [6]: %timeit backward.Slow(np.ones([i+1]*factors), i, 0)
1 loops, best of 3: 4.67 s per loop
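Answer 0 (score: 2)
It is related to the ndarray j that you pass to Numeraire without using it. If you run cython -a backward.pyx and look at the annotated output, the first thing you see is that the cdef double Numeraire... line is highlighted pale yellow (showing that Cython is doing hidden work there). When you click on that line you get the following code: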
static double __pyx_f_8backward_Numeraire(int __pyx_v_i1, int __pyx_v_i0, CYTHON_UNUSED PyArrayObject *__pyx_v_j) {
  float __pyx_v_rate;
  __Pyx_LocalBuf_ND __pyx_pybuffernd_j;
  __Pyx_Buffer __pyx_pybuffer_j;
  double __pyx_r;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("Numeraire", 0);
  __pyx_pybuffer_j.pybuffer.buf = NULL;
  __pyx_pybuffer_j.refcount = 0;
  __pyx_pybuffernd_j.data = NULL;
  __pyx_pybuffernd_j.rcbuffer = &__pyx_pybuffer_j;
  {
    __Pyx_BufFmt_StackElem __pyx_stack[1];
    if (unlikely(__Pyx_GetBufferAndValidate(&__pyx_pybuffernd_j.rcbuffer->pybuffer, (PyObject*)__pyx_v_j, &__Pyx_TypeInfo_nn___pyx_t_5numpy_int_t, PyBUF_FORMAT| PyBUF_STRIDES, 1, 0, __pyx_stack) == -1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 9; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  }
  __pyx_pybuffernd_j.diminfo[0].strides = __pyx_pybuffernd_j.rcbuffer->pybuffer.strides[0]; __pyx_pybuffernd_j.diminfo[0].shape = __pyx_pybuffernd_j.rcbuffer->pybuffer.shape[0];
  /* … */
  /* function exit code */
  __pyx_L1_error:;
  { PyObject *__pyx_type, *__pyx_value, *__pyx_tb;
    __Pyx_ErrFetch(&__pyx_type, &__pyx_value, &__pyx_tb);
    __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_j.rcbuffer->pybuffer);
  __Pyx_ErrRestore(__pyx_type, __pyx_value, __pyx_tb);}
  __Pyx_WriteUnraisable("backward.Numeraire", __pyx_clineno, __pyx_lineno, __pyx_filename, 0);
  __pyx_r = 0;
  goto __pyx_L2;
  __pyx_L0:;
  __pyx_L2:;
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}
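where the body of the function sits in the part marked /* … */.
Some of this work applies to every Cython call, but a good portion of it relates to j, the ndarray you aren't using (e.g. __pyx_pybuffer_j, __pyx_pybuffernd_j). The buffer for j is acquired and validated on every call to Numeraire, even though the function never reads it.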
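To get a feel for why a small per-call cost matters here, the following is a rough back-of-the-envelope sketch in plain Python. It counts how often the innermost loop body runs for the benchmark in the question (i1=360, i0=0) and divides the gap between the two %timeit results by that count. The helper name count_inner_iterations is made up for this illustration, and the estimate assumes the whole timing gap comes from the call to Numeraire.

```python
def count_inner_iterations(i1, i0):
    """Number of times the innermost loop body (and so Numeraire) runs."""
    # For each i the two nested loops run (i + 1) * (i + 1) times.
    return sum((i + 1) ** 2 for i in range(i1 - 1, i0 - 1, -1))


calls = count_inner_iterations(360, 0)

# %timeit figures quoted in the question, in seconds.
slow_s, fast_s = 4.67, 0.104

# Rough per-call overhead in nanoseconds, assuming the whole gap is call overhead.
overhead_ns = (slow_s - fast_s) / calls * 1e9

print(f"{calls} calls, about {overhead_ns:.0f} ns of overhead per call")
```

With more than fifteen million calls, an overhead of a few hundred nanoseconds per call is enough to account for the several seconds of difference.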
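If you remove j from the argument list, the speed is the same with and without the function call. If you really do need j in the non-trivial, non-example version of this function, there are a number of options.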
If you always know that j will have length 2, you could just have
cdef double Numeraire(int i1, int i0, double j0, double j1):
Or you could pass a C-style double*, a length, and possibly a stride (you don't need the stride if you declare j as cdef ndarray[...,mode="c"]), which would probably be faster.
Best option: the simplest choice is to use the new-style Cython typed memoryview interface instead of the ndarray interface.
Code:
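cdef double Numeraire(int i1, int i0, long[::1] j):
    # code as before

# then within your calling function
# ...
# The original used dtype=np.int; that alias was removed in NumPy 1.24,
# so np.int_ is used here. The dtype must match the C long element type.
cdef long[::1] j = np.empty(2, dtype=np.int_)
# ...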
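In this case the overhead appears to be almost nil. (In some other cases, however, I've found the memoryview interface to be slightly slower, by about 1%, so it isn't always the best answer.)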
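When changing the signature of Numeraire (dropping j, or switching to a memoryview), it can help to check that the result is unchanged. The following is a minimal pure-Python sketch of the same backward loop, for use on small grids only. The names numeraire and backward_reference are invented for this illustration and are not part of the original module. It also uses double precision throughout, whereas the Cython code stores rate and norm as C float, so any comparison should allow a small tolerance.

```python
import math

import numpy as np


def numeraire(i1, i0, rate=0.05):
    """One-period discount factor, mirroring the body of Numeraire."""
    return math.exp(-rate / 12 * (i1 - i0))


def backward_reference(values, i1, i0, norm=0.25):
    """Pure-Python version of the loop in Slow/Fast; updates values in place."""
    for i in range(i1 - 1, i0 - 1, -1):
        factor = norm * numeraire(i + 1, i)
        for j0 in range(i + 1):
            for j1 in range(i + 1):
                values[j0, j1] += (
                    values[j0 + 1, j1] +
                    values[j0, j1 + 1] +
                    values[j0 + 1, j1 + 1])
                values[j0, j1] *= factor
    return values


# Small grid: with all-ones terminal values, each step averages four equal
# neighbours and discounts once, so values[0, 0] ends up as exp(-rate * i1 / 12).
result = backward_reference(np.ones((3, 3)), 2, 0)
print(result[0, 0], math.exp(-0.05 / 12 * 2))
```

Since Slow and Fast modify values in place, one way to use this is to run the compiled function and backward_reference on two copies of the same small array and compare them with np.allclose.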