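Numpy ndarray in Cython runs slowly. What am I doing wrong?

Time: 2014-08-01 11:11:53

Tags: numpy cython

The profile shows that this takes 1.857 seconds to run, whereas the pure Python version takes about 5 seconds. That is still far from the speedup I was expecting.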

3 function calls in 1.857 seconds

Ordered by: internal time

ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     1    1.853    1.853    1.853    1.853  {balCalc2.runLoans}
     1    0.004    0.004    1.857    1.857  <string>:1(<module>)
     1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
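For readers who want to reproduce a report like the one above, here is a minimal sketch using the standard-library cProfile and pstats modules, sorted by internal time. The function run_loans is a hypothetical stand-in for balCalc2.runLoans so that the sketch runs on its own; it is not the code from the question.

```python
import cProfile
import io
import pstats


def run_loans(n=20000):
    # Hypothetical stand-in for balCalc2.runLoans (assumption, not the real code).
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total


def profile_by_internal_time(func):
    """Profile func() and return the text report sorted by internal time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # "tottime" is the time spent in a function itself, excluding its callees.
    stats.sort_stats("tottime").print_stats()
    return buf.getvalue()


report = profile_by_internal_time(run_loans)
print(report)
```

With a compiled extension, the whole Cython function shows up as a single entry, as in the report above, so this kind of profile tells you how long the call took but not which line inside it is slow.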

Here is the code I am running...

import numpy as np
cimport numpy as np
DTYPE = np.float64
ctypedef np.float64_t DTYPE_t
cimport cython
@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)

cdef double c2p(double upb):return 0.12
cdef double c2d(double upb):return 0.22
cdef double d2c(double upb):return 0.20
cdef double d2l(double upb):return 0.10
cdef double d2m(double upb):return 0.05
cdef double ficoDrift(double fico): return fico
cdef double genSeverity(char* state, double appraisal, double default, double count_dq): return 0.6
cdef double genWac(double CollType, double mod_age, double wac): return wac
cdef double genSchWac(double CollType, double wala, double wac): return wac
cdef double genAmort(double wac, double wam, double wala, double upb, double status):
    if status == 0.0:
        return 1.0
    else:
        return 0.0


cdef np.ndarray[DTYPE_t, ndim=1] genNextStat(double random_number, char* pooltype1, char* state, np.ndarray[DTYPE_t, ndim=1] arryLoans):
    cdef double CollType = arryLoans[0]
    cdef double period = arryLoans[1]
    cdef double upb = arryLoans[2]
    cdef double defer = arryLoans[3]
    cdef double sch_wac = arryLoans[4]
    cdef double wac = arryLoans[5]
    cdef double wam = arryLoans[6]
    cdef double wala = arryLoans[7]
    cdef double fico = arryLoans[8]
    cdef double appraisal = arryLoans[9]
    cdef double mba_stat = arryLoans[10]
    cdef double mod_stat = arryLoans[11]
    cdef double mod_age = arryLoans[12]
    cdef double count_c = arryLoans[13]
    cdef double count_dq = arryLoans[14]
    cdef double prepay = 0.0
    cdef double default = 0.0
    cdef double amort = 0.0
    cdef double loss = 0.0
    cdef double forgive = 0.0
    cdef double prob_p
    cdef double prob_d
    cdef double prob_c
    cdef double prob_m
    cdef double prob_l
    cdef np.ndarray[DTYPE_t, ndim=1] value

    period += 1.0
    wala += 1.0
    wam -= 1.0
    sch_wac = genSchWac(CollType, wala, wac)
    wac = genWac(CollType, mod_age, wala)
    amort = genAmort(wac, wam, wala, upb, mba_stat)
    upb -= amort

    prob_c = d2c(upb)
    prob_m = d2m(upb)
    prob_l = d2l(upb)
    prob_p = c2p(upb)
    prob_d = c2d(upb)

    #omit some operation here...

    value = np.array([CollType, period, upb, defer, sch_wac, wac, wam, wala, fico, appraisal, mba_stat, mod_stat, mod_age, count_c, count_dq, amort, prepay, default, loss, forgive])

    return value


def runLoans(np.ndarray[DTYPE_t, ndim=1] initLoan = np.array([10.0, 360.0, 10000.0, 50000.0, 6.0, 6.0, 350.0, 9.0, 600.0, 150000.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])):
    cdef np.ndarray[DTYPE_t, ndim=2] loans = np.zeros((360000, 20))
    cdef size_t i
    cdef double rn = 0.53
    cdef char* pooltype = 'MBS'
    cdef char* prop_stat = 'CA'
    loans[0] = initLoan

    for i in range(1,360000):
        loans[i] = genNextStat(rn, pooltype, prop_stat, loans[i-1])

    return loans

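I'm curious how I should go about improving the speed...

2 Answers:

Answer 0: (score: 2)

Have you run cython -a?

It generates an HTML file in which the lines that call into the Python interpreter are highlighted in yellow.

Having done this on your code, I can see straight away that the slow part appears to be how you construct the numpy array to return in the function genNextStat.

I suggest you find a better way of doing this. When you have got rid of the yellow, you will know it is fixed!

Note that the easiest way is probably to pass the loans array and the current row index into genNextStat, so that it just fills in a row of an array that was created outside the loop.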

Answer 1: (score: 2)

The following line creates a list and calls a Python function:

value = np.array([CollType, period, upb, defer, sch_wac, wac, wam, wala, fico, appraisal, 
  mba_stat, mod_stat, mod_age, count_c, count_dq, amort, prepay, default, loss, forgive])

You can pass the loans array and an index to genNextStat() and have it fill the array directly, like this:

cdef genNextStat(double random_number, char* pooltype1, char* state, np.ndarray[DTYPE_t, ndim=2] loans, int idx):
    cdef double CollType = loans[idx, 0]
    cdef double period = loans[idx, 1]
    cdef double upb = loans[idx, 2]
    cdef double defer = loans[idx, 3]
    cdef double sch_wac = loans[idx, 4]
    cdef double wac = loans[idx, 5]
    cdef double wam = loans[idx, 6]
    cdef double wala = loans[idx, 7]
    cdef double fico = loans[idx, 8]
    cdef double appraisal = loans[idx, 9]
    cdef double mba_stat = loans[idx, 10]
    cdef double mod_stat = loans[idx, 11]
    cdef double mod_age = loans[idx, 12]
    cdef double count_c = loans[idx, 13]
    cdef double count_dq = loans[idx, 14]

    #...

    idx += 1
    loans[idx, 0] = CollType
    loans[idx, 1] = period
    loans[idx, 2] = upb
    loans[idx, 3] = defer
    loans[idx, 4] = sch_wac
    loans[idx, 5] = wac
    loans[idx, 6] = wam
    loans[idx, 7] = wala
    loans[idx, 8] = fico
    loans[idx, 9] = appraisal
    loans[idx, 10] = mba_stat
    loans[idx, 11] = mod_stat
    loans[idx, 12] = mod_age
    loans[idx, 13] = count_c
    loans[idx, 14] = count_dq
    loans[idx, 15] = amort
    loans[idx, 16] = prepay
    loans[idx, 17] = default
    loans[idx, 18] = loss
    loans[idx, 19] = forgive

The loop in `runLoans` then becomes:

    for i in range(1,360000):
        genNextStat(rn, pooltype, prop_stat, loans, i-1)
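
Before relying on a rewrite like this, it is worth checking that the in-place version produces the same table as the original allocate-and-return version. The sketch below models both styles in plain Python. It is an illustration under stated assumptions, not the Cython code itself: plain lists stand in for the NumPy array so it runs without dependencies, the state is reduced to three columns (period, wala, wam), and all function names are hypothetical.

```python
N_COLS = 3  # reduced state: period, wala, wam (the real rows have 20 columns)


def next_state_new_row(prev):
    """Original style: build and return a fresh row on every call."""
    period, wala, wam = prev
    return [period + 1.0, wala + 1.0, wam - 1.0]


def next_state_in_place(loans, idx):
    """Rewritten style: read row idx, write the result into row idx + 1."""
    period, wala, wam = loans[idx]
    row = loans[idx + 1]
    row[0] = period + 1.0
    row[1] = wala + 1.0
    row[2] = wam - 1.0


def run_new_rows(init, n_rows):
    loans = [[0.0] * N_COLS for _ in range(n_rows)]
    loans[0] = list(init)
    for i in range(1, n_rows):
        loans[i] = next_state_new_row(loans[i - 1])
    return loans


def run_in_place(init, n_rows):
    # The table is allocated once, outside the loop.
    loans = [[0.0] * N_COLS for _ in range(n_rows)]
    loans[0][:] = init
    for i in range(1, n_rows):
        # Called with i - 1, so the last call writes row n_rows - 1.
        next_state_in_place(loans, i - 1)
    return loans


init = [360.0, 9.0, 350.0]  # period, wala, wam taken from initLoan in the question
same = run_new_rows(init, 12) == run_in_place(init, 12)
print(same)
```

Note that the helper writes to row idx + 1, so the loop has to stop one row before the end of the table. In this Python model a mistake would raise IndexError, but in Cython with bounds checking disabled for the function, an out-of-range idx + 1 is not caught and the behaviour is undefined.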