Question

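I have converted a Python function to a Cython equivalent by adding types to some of the variables. However, the Cython function produces slightly different output from the original Python function.

I have learned some of the reasons for this difference in the post "Cython: unsigned int indices for numpy arrays gives different result". Even with what I learned there, I still cannot get the Cython function to produce the same results as the Python one.

So I have put together 4 functions to illustrate what I have tried. Could someone help explain why I get slightly different results for each function, and how to get a Cython function that returns exactly the same values as function1? I make a few comments below: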

%%cython
import numpy as np
cimport numpy as np    

def function1(response, max_loc):    
    x, y = int(max_loc[0]), int(max_loc[1])

    tmp1 = (response[y,x+1] - response[y,x-1]) / 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))
    tmp2 = (response[y,x+1] - response[y,x-1])
    tmp3 = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3

cpdef function2(np.ndarray[np.float32_t, ndim=2] response, np.ndarray[np.float64_t, ndim=1] max_loc):
    cdef unsigned int x, y 
    x, y = int(max_loc[0]), int(max_loc[1])

    tmp1 = (response[y,x+1] - response[y,x-1]) / 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))        
    tmp2 = (response[y,x+1] - response[y,x-1])
    tmp3 = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))     

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3


cpdef function3(np.ndarray[np.float32_t, ndim=2] response, np.ndarray[np.float64_t, ndim=1] max_loc):     
    cdef unsigned int x, y 
    x, y = int(max_loc[0]), int(max_loc[1])

    cdef np.float32_t tmp1, tmp2, tmp3
    cdef np.float32_t r1 =response[y,x+1]
    cdef np.float32_t r2 =response[y,x-1]
    cdef np.float32_t r3 =response[y,x]
    cdef np.float32_t r4 =response[y,x-1]
    cdef np.float32_t r5 =response[y,x+1]    

    tmp1 = (r1 - r2) / 2*(r3 - min(r4, r5))  
    tmp2 = (r1 - r2)
    tmp3 = 2*(r3 - min(r4, r5))

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3

def function4(response, max_loc):     
    x, y = int(max_loc[0]), int(max_loc[1])

    tmp1 = (float(response[y,x+1]) - response[y,x-1]) / 2*(float(response[y,x]) - min(response[y,x-1], response[y,x+1]))
    tmp2 = (float(response[y,x+1]) - response[y,x-1])
    tmp3 = 2*(float(response[y,x]) - min(response[y,x-1], response[y,x+1]))

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3

max_loc = np.asarray([ 15., 25.], dtype=np.float64) 
response = np.zeros((49,49), dtype=np.float32)     
x, y = int(max_loc[0]), int(max_loc[1])

response[y,x] = 0.959878861904  
response[y,x-1] = 0.438348740339
response[y,x+1] = 0.753262758255  

result1 = function1(response, max_loc)
result2 = function2(response, max_loc)
result3 = function3(response, max_loc)
result4 = function4(response, max_loc)
print result1
print result2
print result3
print result4

And the results:

0.0821185777156 0.314914 1.04306030273
0.082118573023 0.314914017916 1.04306024313
0.0821185708046 0.314914017916 1.04306030273
0.082118573023 0.314914017916 1.04306024313
(0.082118577715618812, 0.31491402, 1.043060302734375)
(0.08211857302303427, 0.3149140179157257, 1.0430602431297302)
(0.08211857080459595, 0.3149140179157257, 1.043060302734375)
(0.082118573023034269, 0.31491401791572571, 1.0430602431297302)

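function1 represents the operations I do in my original Python function. tmp1 is the result.

function2 is my first Cython version, and it produces a slightly different result. Apparently, if the response array is indexed with a typed variable (unsigned int or int), the result is coerced to a double (using PyFloat_FromDouble), even though the array is typed as np.float32_t. If the array is indexed with a Python int, the function PyObject_GetItem is used instead and I get an np.float32_t, which is what happens in function1. So the expressions in function1 are computed with np.float32_t operands, while the expressions in function2 are computed with doubles. The printout differs slightly from function1.

function3 is my second Cython attempt at getting the same output as function1. Here I use unsigned int indices to access the response array, but the results are stored in np.float32_t intermediate variables, which I then use in the calculation. The result is again slightly different. Apparently the print statement uses PyFloat_FromDouble, so it cannot print an np.float32_t.

Then I tried changing the Python function to match the Cython one. function4 tries to achieve this by converting at least one operand in each expression to float, so that the remaining operands are also coerced to Python float (which is a double in Cython) and the expression is computed with doubles, as in function2. The print inside the function is exactly the same as in function2, but the returned values are slightly different?!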

Answer 1

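Let's compare:

function1 stays in float32_t all the way through.
function2 converts to float upon indexing, does the intermediate steps with float, then converts back to float32_t for the final results.
function3 converts to float, but then immediately back to float32_t, and does the intermediate steps with that.
function4 converts to float, does the intermediate steps, then returns the final results as float.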
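To make the effect of the working precision concrete, here is a minimal sketch that recomputes tmp3 from the question in two ways without NumPy or Cython. The helper f32 is hypothetical (not part of the question); it emulates a float32 store by packing a Python float into 4 bytes with the standard struct module. The names center, left and right are also introduced here for the three array entries.

```python
import struct


def f32(value):
    """Round a Python float (a C double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


# The three array entries from the question, as stored in a float32 array.
center = f32(0.959878861904)  # response[y, x]
left = f32(0.438348740339)    # response[y, x-1]
right = f32(0.753262758255)   # response[y, x+1]

# Double-precision route: no rounding to float32 at any step.
tmp3_double = 2 * (center - min(left, right))

# Single-precision route: round to float32 after every operation.
tmp3_single = f32(2 * f32(center - min(left, right)))

print(repr(tmp3_double))
print(repr(tmp3_single))
```

Under these assumptions the two routes should land on the two distinct tmp3 values that appear in the question's output, which suggests that the only thing separating them is where the rounding to single precision happens.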

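As for why function4 prints the same thing as function2 but returns something different: if you look at the types, it's simple. The values are apparently close enough to print the same way, but not close enough to repr the same way. That isn't surprising, given that they are not the same type.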
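The following sketch shows how one and the same double can look different depending on how many digits the formatter emits. It assumes that Python 2's print goes through str(), which formats floats with 12 significant digits, and that the np.float64 repr shown in the question uses 17 significant digits. Both are emulated here with plain format strings, and the variable names are introduced only for illustration.

```python
# tmp1 as returned by function2 in the question (a plain Python float).
value = 0.08211857302303427

# 12 significant digits: assumed to match what Python 2's print shows.
as_print = "%.12g" % value

# 17 significant digits: assumed to match the np.float64 repr in the question.
as_repr17 = "%.17g" % value

# Shortest string that still round-trips: the repr of a plain Python float.
as_shortest = repr(value)

print(as_print)
print(as_repr17)
print(as_shortest)

# All three strings describe the same double to within the printed precision;
# the two longer ones parse back to exactly the same value.
print(float(as_repr17) == value, float(as_shortest) == value)
```

If that assumption holds, differing repr strings for result2 and result4 do not by themselves mean that the underlying doubles differ; comparing the values with == is the more reliable check.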

Answer 2

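If you are using single-precision floats, which have only about 7.225 decimal digits of precision, I wouldn't expect a small variance from coercing to double to matter much.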
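The 7.225 figure follows from the 24-bit significand of IEEE 754 single precision (23 stored bits plus one implicit bit). A quick check, written as a small sketch with names chosen here for illustration:

```python
import math
import sys

# IEEE 754 single precision: 23 stored significand bits + 1 implicit bit.
FLOAT32_SIGNIFICAND_BITS = 24

# Number of decimal digits that a binary significand of that width can carry.
float32_digits = FLOAT32_SIGNIFICAND_BITS * math.log10(2)

# Same calculation for the platform's double (53 bits on IEEE 754 systems).
float64_digits = sys.float_info.mant_dig * math.log10(2)

print(round(float32_digits, 3))
print(round(float64_digits, 3))
```

The differences in the question first show up around the eighth significant digit, which is right at the edge of what single precision can resolve.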

To clarify your description of function2: if you index with an object, Cython uses PyObject_GetItem to get an np.float32 scalar object (not an np.float32_t, which is just a typedef for a C float). If you instead index directly into the buffer and Cython needs an object, it calls PyFloat_FromDouble. It needs objects to assign tmp1, tmp2 and tmp3, since they are not typed.

In function3, on the other hand, you typed the tmp variables, but Cython still needs to create float objects to print and return the results. If you use a NumPy ndarray instead (see below), you won't have this problem.

By the way, in function1, when you divide by 2, the result is promoted to np.float64. For example:

>>> type(np.float32(1) / 2)
<type 'numpy.float64'>

vs.

>>> type(np.float32(1) / np.float32(2))
<type 'numpy.float32'>

Even if you make sure that all operations are float32 in both the def and cpdef functions, the end result might still differ between the two in the compiled extension module. In the following example, I checked that the intermediate results in function1 are all np.float32 objects. In the generated C for function2, I checked that there is no cast to double (or an equivalent typedef). Yet the two functions still produce slightly different results. I would probably have to dig into the compiled assembly to find out why, but maybe I'm overlooking something simple.

def function1(response, max_loc):
    tmp = np.zeros(3, dtype=np.float32)
    x, y = int(max_loc[0]), int(max_loc[1])
    tmp[0] = (((response[y,x+1] - response[y,x-1]) / np.float32(2)) *
              (response[y,x] - min(response[y,x-1], response[y,x+1])))
    tmp[1] = response[y,x+1] - response[y,x-1]
    tmp[2] = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))
    print tmp[0], tmp[1], tmp[2]
    return tmp

cpdef function2(np.ndarray[np.float32_t, ndim=2] response, max_loc):
    cdef np.ndarray[np.float32_t, ndim=1] tmp = np.zeros(3, dtype=np.float32)
    cdef unsigned int x, y
    x, y = int(max_loc[0]), int(max_loc[1])
    tmp[0] = (((response[y,x+1] - response[y,x-1]) / <np.float32_t>2) *
              (response[y,x] - min(response[y,x-1], response[y,x+1])))
    tmp[1] = response[y,x+1] - response[y,x-1]
    tmp[2] = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))
    print tmp[int(0)], tmp[int(1)], tmp[int(2)]
    return tmp

Compare:

>>> function1(response, max_loc)
0.0821186 0.314914 1.04306
array([ 0.08211858,  0.31491402,  1.0430603 ], dtype=float32)

>>> function2(response, max_loc)
0.0821186 0.314914 1.04306
array([ 0.08211857,  0.31491402,  1.0430603 ], dtype=float32)
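The answer leaves the cause of this last-digit difference open. One possible explanation, offered here only as an assumption and not checked against the generated assembly, is that the compiled C expression keeps its intermediate values in wider precision and rounds to float32 only once, when storing into tmp[0], while NumPy float32 scalars round after every operation. The sketch below emulates both routes in pure Python; f32 is a hypothetical helper built on the standard struct module, and the variable names are introduced for illustration.

```python
import struct


def f32(value):
    """Round a Python float (a C double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


# The three array entries from the question, as stored in a float32 array.
center = f32(0.959878861904)  # response[y, x]
left = f32(0.438348740339)    # response[y, x-1]
right = f32(0.753262758255)   # response[y, x+1]

diff = right - left                # numerator before the division by 2
gap = center - min(left, right)    # second factor, not yet rounded

# Route A: round every intermediate result to float32.
stepwise = f32(f32(f32(diff) / 2) * f32(gap))

# Route B: keep intermediates in double and round once at the end.
single_rounding = f32((diff / 2) * gap)

print(repr(stepwise))
print(repr(single_rounding))
print(stepwise == single_rounding)
```

Under this assumption the two routes end up on neighbouring float32 values, which would be consistent with the 0.08211858 versus 0.08211857 seen above. Whether the C compiler actually evaluates the expression this way depends on the platform and compiler flags, so this is a hypothesis to test, not a diagnosis.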


Cython function output slightly different from Python function output
