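I'm trying to optimize my Cython code, and there seems to be a lot of room for improvement. Here is part of the profile produced by the %prun extension in the IPython notebook: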
7016695 function calls in 18.475 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
400722 7.723 0.000 15.086 0.000 _methods.py:73(_var)
814815 4.190 0.000 4.190 0.000 {method 'reduce' of 'numpy.ufunc' objects}
1 1.855 1.855 18.475 18.475 {_cython_magic_aed83b9d1a706200aa6cef0b7577cf41.knn_alg}
403683 0.838 0.000 1.047 0.000 _methods.py:39(_count_reduce_items)
813031 0.782 0.000 0.782 0.000 {numpy.core.multiarray.array}
398748 0.611 0.000 15.485 0.000 fromnumeric.py:2819(var)
804405 0.556 0.000 1.327 0.000 numeric.py:462(asanyarray)
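Seeing that my program spends almost 8 seconds just computing the variance, I'm hoping this can be sped up.
I'm using np.var() to compute the variance of a 1D array of length 404, ~1000 times. I checked the C standard library and unfortunately it has no function for this, and I'd rather not write my own in C.
1. Are there any other options?
2. Can the time spent in the second item in the list be reduced?
Here is my code, in case it helps to see it: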
cpdef knn_alg(np.ndarray[double, ndim=2] temp, np.ndarray[double, ndim=1] jan1, int L, int w, int B):
    cdef np.ndarray[double, ndim=3] lnn = np.zeros((L+1,temp.shape[1],365))
    lnn = lnn_alg(temp, L, w)
    cdef np.ndarray[double, ndim=2] sim = np.zeros((len(temp),temp.shape[1]))
    cdef np.ndarray [double, ndim=2] a = np.zeros((L+1,lnn.shape[1]))
    cdef int b
    cdef np.ndarray [double, ndim=2] c = np.zeros((L,lnn.shape[1]-3))
    cdef np.ndarray [double, ndim=2] lnn_scale = np.zeros((L,lnn.shape[1]))
    cdef np.ndarray [double, ndim=2] cov_t = np.zeros((3,3))
    cdef np.ndarray [double, ndim=2] dk = np.zeros((L,4))
    cdef int random_selection
    cdef np.ndarray [double, ndim=1] day_month
    cdef int day_of_year
    cdef np.ndarray [double, ndim=2] lnn_scaled
    cdef np.ndarray [double, ndim=2] temp_scaled
    cdef np.ndarray [double, ndim=2] eig_vec
    cdef double PC_t
    cdef np.ndarray [double, ndim=1] PC_l
    cdef double K
    cdef np.ndarray[double, ndim=2] knn
    cdef np.ndarray[double, ndim=1] val
    cdef np.ndarray[double, ndim=1] pn
    cdef double rand_num
    cdef int nn
    cdef int index
    cdef int inc
    cdef int i
    sim[0,:] = jan1
    for i in xrange(1,len(temp),B):
        #If leap day then randomly select Feb 28 or Mar 1
        if (temp[i,4]==2) & (temp[i,3]==29):
            #randint excludes the upper bound, so (0,2) yields 0 or 1
            random_selection = np.random.randint(0,2)
            day_month = np.array([[28,2],[1,3]])[random_selection]
        else:
            day_month = temp[i,3:5]
        #Convert day month to day of year for L+1 nearest neighbors selection
        current = datetime.datetime(2014, (<int>day_month[1]), (<int>day_month[0]))
        day_of_year = current.timetuple().tm_yday - 1
        #Take out current day from L+1 nearest neighbors
        a = lnn[:,:,day_of_year]
        b = np.where((a[:,3:6] == temp[i,3:6]).all(axis=-1))[0][0]
        c = np.delete(a,(b), axis=0)
        #Scale and center data from nearest neighbors and spatially averaged historical data
        lnn_scaled = scale(c[:,0:3])
        temp_scaled = scale(temp[:,0:3])
        #Calculate covariance matrix of nearest neighbors
        cov_t[:,:] = np.cov(lnn_scaled.T)
        #Calculate eigenvalues and vectors of covariance matrix
        eig_vec = eig(cov_t)[1]
        #Calculate principal components of scaled L nearest neighbors and of the current day's scaled data
        PC_t = np.dot(temp_scaled[i],eig_vec[0])
        PC_l = np.dot(lnn_scaled,eig_vec[0])
        #Calculate Mahalanobis distance
        dk = np.zeros((404,4))
        dk[:,0] = np.array([sqrt((PC_t-pc)**2/np.var(PC_l)) for pc in PC_l])
        dk[:,1:4] = c[:,3:6]
        #Extract K nearest neighbors
        dk = dk[dk[:,0].argsort()]
        K = round(sqrt(L),0)
        knn = dk[0:(<int>K)]
        #Create probability density function
        val = np.array([1.0/k for k in range(1,len(knn)+1)])
        wk = val/(<int>val.sum())
        pn = wk.cumsum()
        #Select next days value from KNNs using probability density function with random value
        rand_num = np.random.rand(1)[0]
        nn = (abs(pn-rand_num)).argmin()
        index = np.where((temp[:,3:6] == knn[nn,1:4]).all(axis=-1))[0][0]
        if i+B > len(temp):
            inc = len(temp) - i
        else:
            inc = B
        if (index+B > len(temp)):
            index = len(temp)-B
        sim[i:i+inc,:] = temp[index:index+inc,:]
    return sim
The variance calculation is in this line:
dk[:,0] = np.array([sqrt((PC_t-pc)**2/np.var(PC_l)) for pc in PC_l])
Any suggestions would be very helpful, as I'm quite new to Cython.
Answer 0 (score: 3)
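I went through the calculation, and I think the reason it was so slow is that I was calling np.var(), which is a Python (NumPy) function and prevents the loop from being compiled to C. If anyone knows how to do this while still using NumPy, please let me know.
What I ended up doing was rewriting the calculation: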
dk[:,0] = np.array([sqrt((PC_t-pc)**2/np.var(PC_l)) for pc in PC_l])
as a separate function:
cimport cython
cimport numpy as np
import numpy as np
from libc.math cimport sqrt as csqrt
from libc.math cimport pow as cpow
@cython.boundscheck(False)
@cython.cdivision(True)
cdef cy_mahalanobis(np.ndarray[double, ndim=1] PC_l, double PC_t):
    cdef unsigned int i,j,L
    L = PC_l.shape[0]
    cdef np.ndarray[double] dk = np.zeros(L)
    cdef double x,total,mean,var
    # First pass: mean of PC_l
    total = 0
    for i in xrange(L):
        x = PC_l[i]
        total = total + x
    mean = total / L
    # Second pass: variance, dividing by L as np.var does by default
    total = 0
    for i in xrange(L):
        x = cpow(PC_l[i]-mean,2)
        total = total + x
    var = total / L
    # Distance of each neighbor's component from PC_t, scaled by the variance
    for j in xrange(L):
        dk[j] = csqrt(cpow(PC_t-PC_l[j],2)/var)
    return dk
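Because I'm not calling any Python functions (including NumPy), the whole loop can be compiled to C (no yellow lines show up when using the annotate option, cython -a file.pyx, or %%cython -a in the IPython notebook).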
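Overall, my code ended up an order of magnitude faster! Writing this by hand was worth the effort. My Cython (and Python) isn't the best, so any additional suggestions or answers are appreciated.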
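As a sanity check on the hand-written loop, the sketch below mirrors the same two-pass calculation in plain Python and compares it with the original np.var() expression. The function name mahalanobis_1d and the random test data are illustrative assumptions, not part of the original code. The comparison relies on np.var() dividing by the number of elements by default (ddof=0), which is the same thing the loop does.

```python
import math
import numpy as np

def mahalanobis_1d(pc_l, pc_t):
    """Plain-Python mirror of the two-pass loop (hypothetical helper name)."""
    n = len(pc_l)
    mean = sum(pc_l) / n                              # first pass: mean
    var = sum((x - mean) ** 2 for x in pc_l) / n      # second pass: population variance
    return [math.sqrt((pc_t - x) ** 2 / var) for x in pc_l]

# Made-up data with the same length as in the question (404 neighbors).
rng = np.random.default_rng(0)
pc_l = rng.normal(size=404)
pc_t = 0.3

# The original expression from the question, used as the reference.
expected = np.array([math.sqrt((pc_t - pc) ** 2 / np.var(pc_l)) for pc in pc_l])
result = np.array(mahalanobis_1d(pc_l.tolist(), pc_t))
matches = bool(np.allclose(result, expected))
```

If a sample variance (dividing by n - 1) were wanted instead, both the loop and the np.var() call (via ddof=1) would need to change together.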
Answer 1 (score: 1)
Make sure that your for loop
dk[:,0] = np.array([sqrt((PC_t-pc)**2/np.var(PC_l)) for pc in PC_l])
is actually being compiled to a C loop. See this link to the Cython documentation.
If it isn't, it may help to declare pc with a cdef type so that no Python objects are referenced. (Another link to the docs)
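Separately from how the loop is compiled, np.var(PC_l) does not depend on pc, yet the list comprehension evaluates it once per element. The profile in the question is consistent with this: 398748 calls to var is exactly 404 × 987. A minimal NumPy-only sketch, assuming PC_l is a 1D float array and PC_t a scalar as in the question, computes the variance once and lets broadcasting do the rest. The function names and test data here are illustrative, not from the original code.

```python
import numpy as np

def distances_loop(pc_l, pc_t):
    # Shape of the original expression: np.var is re-evaluated for every element.
    return np.array([np.sqrt((pc_t - pc) ** 2 / np.var(pc_l)) for pc in pc_l])

def distances_vectorized(pc_l, pc_t):
    # The variance is computed once; sqrt(d**2 / var) equals |d| / sqrt(var).
    std = np.sqrt(np.var(pc_l))
    return np.abs(pc_t - pc_l) / std

# Made-up data with the same length as in the question.
rng = np.random.default_rng(1)
pc_l = rng.normal(size=404)
pc_t = -0.7

loop_result = distances_loop(pc_l, pc_t)
vec_result = distances_vectorized(pc_l, pc_t)
same = bool(np.allclose(loop_result, vec_result))
```

Since most of the ufunc reduce calls in the profile appear to come from inside np.var, calling it once per outer iteration should also shrink the second line of the profile, though the actual gain would need to be measured.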