Arnaud Legoux Moving Average and numpy

Time: 2017-10-28 13:11:20

Tags: python pandas numpy vectorization

I would like to write a vectorized version of the code that computes the Arnaud Legoux Moving Average, using NumPy (or Pandas). Could you help me with this? Thanks.

The non-vectorized version looks as follows (see below).

import numpy as np

def NPALMA(pnp_array, **kwargs):
    '''
    ALMA - Arnaud Legoux Moving Average,
    http://www.financial-hacker.com/trend-delusion-or-reality/
    https://github.com/darwinsys/Trading_Strategies/blob/master/ML/Features.py
    '''
    # window length
    length = kwargs['length']
    # just some number (6.0 is useful)
    sigma = kwargs['sigma']
    # sensitivity (close to 1) or smoothness (close to 0)
    offset = kwargs['offset']

    asize = length - 1
    m = offset * asize
    s = length / sigma
    dss = 2 * s * s

    alma = np.zeros(pnp_array.shape)
    wtd_sum = np.zeros(pnp_array.shape)

    for l in range(len(pnp_array)):
        if l >= asize:
            # Gaussian-style weights over the trailing window of `length` rows
            for i in range(length):
                im = i - m
                wtd = np.exp(-(im * im) / dss)
                alma[l] += pnp_array[l - length + i] * wtd
                wtd_sum[l] += wtd
            alma[l] = alma[l] / wtd_sum[l]

    return alma
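
For illustration, calling the function above might look like this (the input series and the parameter values here are made-up examples, not from the original post; sigma=6.0 follows the comment in the code):

np.random.seed(0)
prices = np.random.rand(500)   # e.g. a 1-D series of closing prices (made up)
alma = NPALMA(prices, length=9, sigma=6.0, offset=0.85)
# The first length-1 entries remain zero, since the loop only fills l >= length-1.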

1 Answer:

Answer 0 (score: 2)

Starter approach

We can create sliding windows along the first axis and then perform the sum-reductions with a tensor multiplication against the array of wtd values.

The implementation would look something like this -

# Get all wtd values in an array
wtds = np.exp(-(np.arange(length) - m)**2/dss)

# Get the sliding windows for input array along first axis
pnp_array3D = strided_axis0(pnp_array,len(wtds))

# Initialize o/p array
out = np.zeros(pnp_array.shape)

# Get sum-reductions for the windows which don't need wrapping over
out[length:] = np.tensordot(pnp_array3D,wtds,axes=((1),(0)))[:-1]

# Last element of the output needed wrapping. So, do it separately.
out[length-1] = wtds.dot(pnp_array[np.r_[-1,range(length-1)]])

# Finally perform the divisions
out /= wtds.sum()

The function that builds the sliding windows, strided_axis0, is from here.
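
That helper comes from an external link; for completeness, a minimal sketch of such a strided_axis0 (built on np.lib.stride_tricks.as_strided; the linked version may differ in its details) could be:

def strided_axis0(a, L):
    # Overlapping windows of length L along axis 0 -> shape (n - L + 1, L, ...).
    # This returns a view into `a`; treat it as read-only.
    n = a.shape[0]
    s0 = a.strides[0]
    return np.lib.stride_tricks.as_strided(
        a, shape=(n - L + 1, L) + a.shape[1:],
        strides=(s0, s0) + a.strides[1:])

On newer NumPy versions, np.lib.stride_tricks.sliding_window_view(a, L, axis=0) gives a similar (and safer) view, though it puts the window axis last, so the tensordot axes would need adjusting.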

Boost with 1D convolution

Those multiplications by the wtds values followed by their sum-reductions are basically a convolution along the first axis. As such, we can use scipy.ndimage.convolve1d with axis=0. This would be much faster because of its memory efficiency, since we no longer create huge sliding windows.

The implementation would be -

from scipy.ndimage import convolve1d as conv

avgs = conv(pnp_array, weights=wtds/wtds.sum(), axis=0, mode='wrap')

So out[length-1:], i.e. the non-zero rows, would be the same as avgs[:-length+1].
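
Building on that relationship, a small sketch (assuming pnp_array, length and the avgs from the snippet above are in scope) that packages the convolution output in the same convention as the loop version, i.e. zeros in the first length-1 rows:

# Same output convention as the original loop: first length-1 rows stay zero.
alma_conv = np.zeros(pnp_array.shape)
alma_conv[length-1:] = avgs[:-length+1]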

There could be some precision differences when working with a really small kernel in the convolution. So keep that in mind if you use this convolution-based approach.

Runtime test

Approaches -

def original_app(pnp_array, length, m, dss):
    asize = length - 1
    alma = np.zeros(pnp_array.shape)
    wtd_sum = np.zeros(pnp_array.shape)

    for l in range(len(pnp_array)):
        if l >= asize:
            for i in range(length):
                im = i - m
                wtd = np.exp(-(im * im) / dss)
                alma[l] += pnp_array[l - length + i] * wtd
                wtd_sum[l] += wtd
            alma[l] = alma[l] / wtd_sum[l]
    return alma

def vectorized_app1(pnp_array, length, m, dss):
    wtds = np.exp(-(np.arange(length) - m)**2/dss)
    pnp_array3D = strided_axis0(pnp_array, len(wtds))
    out = np.zeros(pnp_array.shape)
    out[length:] = np.tensordot(pnp_array3D, wtds, axes=((1),(0)))[:-1]
    out[length-1] = wtds.dot(pnp_array[np.r_[-1, range(length-1)]])
    out /= wtds.sum()
    return out

def vectorized_app2(pnp_array, length, m, dss):
    wtds = np.exp(-(np.arange(length) - m)**2/dss)
    return conv(pnp_array, weights=wtds/wtds.sum(), axis=0, mode='wrap')

Timings -

In [470]: np.random.seed(0)
     ...: m,n = 1000,100
     ...: pnp_array = np.random.rand(m,n)
     ...: 
     ...: length = 6
     ...: sigma = 0.3
     ...: offset = 0.5
     ...: 
     ...: asize = length - 1
     ...: m = np.floor(offset * asize)
     ...: s = length  / sigma
     ...: dss = 2 * s * s
     ...: 

In [471]: %timeit original_app(pnp_array, length, m, dss)
     ...: %timeit vectorized_app1(pnp_array, length, m, dss)
     ...: %timeit vectorized_app2(pnp_array, length, m, dss)
     ...: 
10 loops, best of 3: 36.1 ms per loop
1000 loops, best of 3: 1.84 ms per loop
1000 loops, best of 3: 684 µs per loop

In [472]: np.random.seed(0)
     ...: m,n = 10000,1000 # rest same as previous one

In [473]: %timeit original_app(pnp_array, length, m, dss)
     ...: %timeit vectorized_app1(pnp_array, length, m, dss)
     ...: %timeit vectorized_app2(pnp_array, length, m, dss)
     ...: 
1 loop, best of 3: 503 ms per loop
1 loop, best of 3: 222 ms per loop
10 loops, best of 3: 106 ms per loop
