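How can I make this rounding function faster?

Time: 2014-11-19 06:57:39

Tags: python multithreading numpy multiprocessing numba

I am trying to write a function that rounds the values in an array to the nearest valid odds, as defined by the Betfair price increments:     https://api.developer.betfair.com/services/webapps/docs/display/1smk3cen4v3lu3yomq5qye0ni/Betfair+Price+Increments

My code is here: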

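import numpy as np

def nclosest_valid_odds( x ):
    """
    Round each value in x to the nearest valid Betfair price increment.
    https://api.developer.betfair.com/services/webapps/docs/display/1smk3cen4v3lu3yomq5qye0ni/Betfair+Price+Increments
    """

    # Start from NaN so that anything not matched by a band below
    # (x < 1.0, x > 1000, NaN input) is NaN rather than uninitialised memory.
    r = np.full_like( x, np.nan )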

    bidx      = (1.0<=x) & (x<=2.0)
    r[ bidx ] = 0.01 * np.round( x[ bidx ] / 0.01 )

    bidx      = (2.0<x) & (x<=3.0 )
    r[ bidx ] = 0.02 * np.round( x[ bidx ] / 0.02 )    

    bidx      = (3.0<x) & (x<=4.0)
    r[ bidx ] = 0.05 * np.round( x[ bidx ] / 0.05 )

    bidx      = (4.0<x) & (x<=6.0)
    r[ bidx ] = 0.1 * np.round( x[ bidx ] / 0.1 )

    bidx      = (6.0<x) & (x<=10.0)
    r[ bidx ] = 0.2 * np.round( x[ bidx ] / 0.2 )

    bidx      = (10.0<x) & (x<=20.0)
    r[ bidx ] = 0.5 * np.round( x[ bidx ] / 0.5 )

    bidx      = (20.0<x) & (x<=30.0)
    r[ bidx ] = np.round( x[ bidx ] )

    bidx      = (30.0<x) & (x<=50.0)
    r[ bidx ] = 2.0 * np.round( x[ bidx ] / 2.0 )

    bidx      = (50.0<x) & (x<=100.0)
    r[ bidx ] = 5.0 * np.round( x[ bidx ] / 5.0 )

    bidx      = (100.0<x) & (x<=1000)
    r[ bidx ] = 10.0 * np.round( x[ bidx ] / 10.0 )

    return r

The floor version, which uses numba, is here:

def floor_closest_valid_odds( x ):

    r = np.zeros_like( x )

    for i in range( len( r ) ):
        if x[i]<=1.0:
            r[i] = np.nan
        elif x[i]<=2.0:            
            r[i] = 0.01 * np.floor( x[i] / 0.01 )            
        elif x[i]<=3.0:                        
            r[i] = 0.02 * np.floor( x[i] / 0.02 )            
        elif x[i]<=4.0:            
            r[i] = 0.05 * np.floor( x[i] / 0.05 )
        elif x[i]<=6.0:            
            r[i] = 0.1 * np.floor( x[i] / 0.1 )
        # Increments for the remaining bands follow the Betfair table:
        # 0.2, 0.5, 1.0, 2.0, 5.0 and 10.0.
        elif x[i]<=10.0:
            r[i] = 0.2 * np.floor( x[i] / 0.2 )
        elif x[i]<=20.0:
            r[i] = 0.5 * np.floor( x[i] / 0.5 )
        elif x[i]<=30.0:
            r[i] = 1.0 * np.floor( x[i] / 1.0 )
        elif x[i]<=50.0:
            r[i] = 2.0 * np.floor( x[i] / 2.0 )
        elif x[i]<=100.0:
            r[i] = 5.0 * np.floor( x[i] / 5.0 )
        elif x[i]<=1000.0:
            r[i] = 10.0 * np.floor( x[i] / 10.0 )
        else:            
            r[i] = 1000.0
    return r

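# autojit was the Numba API at the time of writing; later Numba releases
# removed it, and numba.jit (or numba.njit) is the replacement.
from numba import autojit

jfloor_closest_valid_odds = autojit( floor_closest_valid_odds )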

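I timed the code with this:

    # Note: randn draws from a standard normal distribution, so most of these
    # inputs are below 1.0 and take the NaN branch.
    x = np.random.randn( 1000000 )

    with Timer( 'nclosest_odds' ):
        r = nclosest_valid_odds( x )

    # Call once before timing so JIT compilation is not included in the measurement.
    r = helpers.jfloor_closest_valid_odds( x )
    with Timer( 'jfloor_closest_valid_odds' ):
        r = helpers.jfloor_closest_valid_odds( x )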

Timings on my machine:

nclosest odds : 0.06 seconds
jfloor_closest_odds : 0.01 seconds
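
The `Timer` helper used in the timing code is not shown in the post. A minimal stand-in is sketched below, assuming it is a context manager that takes a label and prints the elapsed wall-clock time; the class body and the `elapsed` attribute are hypothetical, not the asker's actual implementation.

```python
import time

class Timer:
    """Hypothetical stand-in for the Timer used in the timing code above."""

    def __init__(self, name):
        self.name = name
        self.elapsed = None

    def __enter__(self):
        # perf_counter is a monotonic, high-resolution clock suited to benchmarks.
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self.start
        print('%s : %.2f seconds' % (self.name, self.elapsed))
        return False  # do not swallow exceptions raised inside the block
```

Used as `with Timer( 'nclosest_odds' ): ...`, it prints a line in the same shape as the timings quoted above.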

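How can I speed up the code using numpy and/or numba?

Solution:

I found that the multithreading example at http://numba.pydata.org/numba-doc/dev/examples.html can be turned into a function vectorizer. Using it gives the best performance on my 2-core laptop.

Numba's own vectorize also works, but it is not as fast.

I have uploaded the vectorizer code to GitHub: https://github.com/jsphon/MTVectorizer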
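
The general shape of the multithreaded approach can be sketched roughly as follows. This is not the MTVectorizer code: `run_chunked` and `round_to_cent` are hypothetical names, and the kernel is a plain-NumPy stand-in so the sketch runs without Numba. The assumption is that the input is split into one contiguous chunk per thread and each thread writes into its own slice of a shared output array. Threads only give a real speedup when the kernel releases the GIL, as a Numba function compiled with `nogil=True` does.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def round_to_cent(chunk, out):
    # Stand-in kernel (hypothetical): rounds to the nearest 0.01 and
    # writes the result into `out` in place.
    np.multiply(np.round(chunk / 0.01), 0.01, out=out)

def run_chunked(kernel, x, n_threads=2):
    out = np.empty_like(x)
    # Chunk boundaries; basic slices of `x` and `out` are views, so each
    # thread writes directly into its own part of the shared output.
    bounds = np.linspace(0, len(x), n_threads + 1).astype(int)
    x_chunks = [x[lo:hi] for lo, hi in zip(bounds[:-1], bounds[1:])]
    out_chunks = [out[lo:hi] for lo, hi in zip(bounds[:-1], bounds[1:])]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list(...) waits for every chunk and re-raises any worker exception.
        list(pool.map(kernel, x_chunks, out_chunks))
    return out
```

With a kernel that takes `(chunk, out)` in this way, `run_chunked(round_to_cent, x, n_threads=2)` returns the same values as calling the kernel once on the whole array.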

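The timing comparison charts are below. The x-axis shows the size of the input array and the y-axis shows the time taken.

This chart is from the dual-core laptop.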

[Figure: Timing Comparisons]

The second chart is from a desktop with an i7 920 CPU. It has 8 logical cores (4 physical cores with Hyper-Threading).


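1 answer:

Answer 0 (score: 2)

You can get a speedup in numpy by first building an array of magnitudes (the increment that applies to each value) and then doing a single rounding step with that array at the end.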

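import numpy as np

def nclosest_valid_odds_3( x ):

    # Default to NaN so values outside [1, 1000] (and NaN inputs) stay NaN
    # rather than picking up uninitialised memory from np.empty_like.
    magnitudes = np.full_like(x, np.nan)

    magnitudes[(1 <= x) & (x <= 2)] = 0.01

    # v holds the band edges, m the increment used in each (low, high] band.
    v = [2, 3, 4, 6, 10, 20, 30, 50, 100, 1000]
    m = [0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10]
    for low, high, mag in zip(v, v[1:], m):
        magnitudes[(low < x) & (x <= high)] = mag

    # One rounding pass over the whole array; NaN magnitudes propagate as NaN.
    return magnitudes * np.round(x / magnitudes)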
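
The same magnitude idea can also be written without the loop of boolean masks by looking up each value's band with `np.searchsorted`. The sketch below is a variant, not part of the original answer: the names `UPPER`, `STEP` and `nclosest_valid_odds_lookup` are made up for illustration, and the band edges are copied from the code above. It has not been benchmarked here, so whether it is faster depends on the array size and NumPy version.

```python
import numpy as np

# Upper edge of each price band and the increment used inside that band.
UPPER = np.array([2.0, 3.0, 4.0, 6.0, 10.0, 20.0, 30.0, 50.0, 100.0, 1000.0])
STEP = np.array([0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0])

def nclosest_valid_odds_lookup(x):
    x = np.asarray(x, dtype=float)
    # side='left' gives the first band whose upper edge is >= x,
    # which reproduces the (low, high] intervals.
    band = np.searchsorted(UPPER, x, side='left')
    # Values outside [1, 1000] (and NaN) are not valid odds.
    valid = (x >= 1.0) & (x <= 1000.0)
    # Clip the index so out-of-range values can be looked up safely;
    # their results are replaced by NaN below.
    step = STEP[np.minimum(band, len(STEP) - 1)]
    return np.where(valid, step * np.round(x / step), np.nan)
```

For example, `nclosest_valid_odds_lookup(np.array([1.234, 7.33]))` rounds the first value with a 0.01 increment and the second with a 0.2 increment.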