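I am trying to speed up comparisons between datetimes using Cython, when passing in a numpy array of datetimes (or enough details to construct the datetimes). As a first step, I tried to see how much Cython speeds up comparisons between integers.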
    testArrayInt = np.load("testArray.npy")
Python version:
    def processInt(array):
        compareSuccess = 0  # number of elements that testValue is greater than
        testValue = 1  # value to compare against
        for counter in range(array.shape[0]):
            if testValue > array[counter]:
                compareSuccess += 1
        print(compareSuccess)
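Cython version:
    cimport numpy as np
    def processInt(np.ndarray[np.int_t, ndim=1] array):
        cdef int rows = array.shape[0]
        cdef int counter = 0
        cdef int compareSuccess = 0  # number of elements that testValue is greater than
        cdef int testValue = 1  # value to compare against
        for counter in range(rows):
            if testValue > array[counter]:
                compareSuccess = compareSuccess + 1
        print(compareSuccess)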
Timings for a numpy array with 1,000,000 rows:
Python: 0.204969 seconds
Cython: 0.000826 seconds
Speedup: 250 times approx.
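Repeating the same exercise with datetimes: since Cython would not accept an array of datetimes, I split them up and passed an array of year, month and day numbers to both methods.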
    testArrayDateTime = np.load("testArrayDateTime.npy")
Python code:
    from datetime import datetime
    def processDateTime(array):
        compareSuccess = 0
        d = datetime(2009, 1, 1)  # test datetime to compare against
        rows = array.shape[0]
        for counter in range(rows):
            dTest = datetime(array[counter][0], array[counter][1], array[counter][2])
            if d > dTest:
                compareSuccess += 1
        print(compareSuccess)
Cython code:
    cimport numpy as np
    from cpython.datetime cimport date
    def processDateTime(np.ndarray[np.int_t, ndim=2] array):
        cdef int compareSuccess = 0
        cdef int rows = array.shape[0]
        cdef int counter = 0
        d = date(2009, 1, 1)  # test date to compare against
        for counter in range(rows):
            dTest = date(array[counter, 0], array[counter, 1], array[counter, 2])
            if d > dTest:
                compareSuccess = compareSuccess + 1
        print(compareSuccess)
Performance:
Python: 0.865261 seconds
Cython: 0.162297 seconds
Speedup: 5 times approx.
Why is the speedup so low? What are possible ways to increase it?
Answer 0 (score: 0)
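You are creating a date object for every row. That takes more time, not only because memory has to be allocated and deallocated, but also because the constructor runs various checks on its arguments to make sure they form a valid date.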
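As a small illustration of the validation step, the sketch below shows the constructor rejecting an impossible date. The helper name try_make_date is hypothetical and not part of the question's code.

```python
from datetime import date

def try_make_date(year, month, day):
    """Return a date, or None if the constructor rejects the arguments."""
    try:
        # date() validates year, month and day on every call
        return date(year, month, day)
    except ValueError:
        return None

ok = try_make_date(2009, 1, 1)    # a valid calendar date
bad = try_make_date(2009, 2, 30)  # February has no 30th day
```

This per-call check is work that a plain integer comparison never has to do.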
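For a faster comparison, either compare np.datetime64 arrays using integer comparison, or compare the year, month and day columns separately as integers.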
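Both suggestions can be sketched in plain NumPy, roughly as follows. This is a minimal sketch, not the asker's code: the function names and sample data are made up for illustration, and the year/month/day variant packs the three columns into one yyyymmdd integer key, which is a slight variation on comparing the columns one by one.

```python
import numpy as np

def count_before_ymd(ymd, year, month, day):
    """Count rows of an (n, 3) [year, month, day] array strictly before the cutoff."""
    ymd = np.asarray(ymd, dtype=np.int64)
    # Pack each row into a yyyymmdd integer; ordering is preserved because
    # month and day are always below 100.
    keys = ymd[:, 0] * 10000 + ymd[:, 1] * 100 + ymd[:, 2]
    cutoff = year * 10000 + month * 100 + day
    return int(np.count_nonzero(keys < cutoff))

def count_before_datetime64(dates, cutoff):
    """Count entries of a datetime64 array strictly before the cutoff date."""
    dates = np.asarray(dates, dtype="datetime64[D]")
    # datetime64 values are stored as integers, so this is an integer comparison
    return int(np.count_nonzero(dates < np.datetime64(cutoff, "D")))

# Hypothetical sample data, standing in for testArrayDateTime
sample = np.array([[2008, 12, 31], [2009, 1, 1], [2009, 1, 2], [2007, 6, 15]])
sample_dates = np.array(
    ["2008-12-31", "2009-01-01", "2009-01-02", "2007-06-15"],
    dtype="datetime64[D]",
)

n_ymd = count_before_ymd(sample, 2009, 1, 1)
n_dt64 = count_before_datetime64(sample_dates, "2009-01-01")
```

Neither variant creates a Python date object per row, so the same integer keys could also be compared inside a typed Cython loop like the one in processInt.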