How to calculate a moving average with custom weights in pandas?

Time: 2018-08-24 16:12:55

Tags: python pandas performance numpy for-loop

I have a DataFrame with two columns, a: [1,2,3,4,5] and b: [1,0.4,0.3,0.5,0.2]. How can I create a column c defined as follows:

c[0] = 1  
c[i] = c[i-1]*b[i]+a[i]*(1-b[i]) 

so that c is [1, 1.6, 2.58, 3.29, 4.658]

Calculation:

1 = 1
1*0.4+2*0.6 = 1.6
1.6*0.3+3*0.7 = 2.58
2.58*0.5+4*0.5 = 3.29
3.29*0.2+5*0.8 = 4.658
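
The arithmetic above is a running fold: each c[i] depends only on the previous c[i-1] and the current a[i], b[i]. As a minimal sketch using only the standard library, the same steps can be reproduced with `itertools.accumulate`. The helper name `step` is made up for illustration, and the `initial=` argument assumes Python 3.8 or newer.

```python
from itertools import accumulate

a = [1, 2, 3, 4, 5]
b = [1, 0.4, 0.3, 0.5, 0.2]

def step(prev_c, pair):
    """One step of the recurrence: c[i] = c[i-1]*b[i] + a[i]*(1 - b[i])."""
    a_i, b_i = pair
    return prev_c * b_i + a_i * (1 - b_i)

# Seed with c[0] = 1, then fold over the pairs (a[i], b[i]) for i >= 1.
# Note that a[0] and b[0] are never used by the recurrence.
c = list(accumulate(zip(a[1:], b[1:]), step, initial=1.0))

# Round only for display; the raw floats may carry tiny rounding error.
print([round(x, 3) for x in c])
```

This is only meant to make the hand calculation checkable; it is still a Python-level loop and is not expected to be fast on large inputs.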

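2 answers:

Answer 0 (score: 1)

I can't see a way to vectorize this recursive algorithm. However, you can use numba to optimize your current logic. This should be faster than a regular Python loop.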

import numpy as np
import pandas as pd
from numba import jit

df = pd.DataFrame({'a': [1,2,3,4,5],
                   'b': [1,0.4,0.3,0.5,0.2]})

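@jit(nopython=True)
def foo(a, b):
    # c has the same length as a; c[0] is the fixed starting value
    c = np.zeros(a.shape)
    c[0] = 1
    # each c[i] depends on c[i-1], so the loop must run in order
    for i in range(1, c.shape[0]):
        c[i] = c[i-1] * b[i] + a[i] * (1-b[i])
    return c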

df['c'] = foo(df['a'].values, df['b'].values)

print(df)

   a    b      c
0  1  1.0  1.000
1  2  0.4  1.600
2  3  0.3  2.580
3  4  0.5  3.290
4  5  0.2  4.658
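
As a side note to the answer above: the recurrence is linear in c, so under one extra assumption it can be unrolled into cumulative products and sums. The sketch below assumes every b[i] for i >= 1 is nonzero, because it divides by the running product of b. The function name `recurrence_closed_form` and the `c0` parameter are made up for illustration. Writing P[i] = b[1]*...*b[i], the identity used is c[i] = P[i] * (c[0] + sum over j=1..i of a[j]*(1-b[j])/P[j]).

```python
import numpy as np

def recurrence_closed_form(a, b, c0=1.0):
    """Unrolled form of c[i] = c[i-1]*b[i] + a[i]*(1 - b[i]) with c[0] = c0.

    Assumes b[i] != 0 for all i >= 1 (b[0] is ignored).
    """
    a = np.asarray(a, dtype=float)
    w = np.array(b, dtype=float)   # copy, so the caller's data is untouched
    w[0] = 1.0                     # b[0] never enters the recurrence
    P = np.cumprod(w)              # P[i] = b[1] * ... * b[i], with P[0] = 1
    terms = a * (1.0 - w) / P      # a[j] * (1 - b[j]) / P[j]
    terms[0] = c0                  # seed the running sum with c[0]
    return P * np.cumsum(terms)

c = recurrence_closed_form([1, 2, 3, 4, 5], [1, 0.4, 0.3, 0.5, 0.2])
print(np.round(c, 3))
```

The trade-off is numerical: when the values of b are below 1, the running product P shrinks toward zero as the series gets longer, so the division can lose precision or overflow. For long series the explicit loop (with or without numba) is the safer choice.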

Answer 1 (score: 0)

There may be a smarter way, but here is my attempt:

import pandas as pd

a = [1,2,3,4,5]
b = [1,0.4,0.3,0.5,0.2]

df = pd.DataFrame({'a':a , 'b': b})

for i in range(len(df)):
    if i == 0:
        df.loc[i,'c'] = 1
    else:
        df.loc[i,'c'] = df.loc[i-1,'c'] * df.loc[i,'b'] + df.loc[i,'a'] * (1 - df.loc[i,'b'])

Output:

   a    b      c
0  1  1.0  1.000
1  2  0.4  1.600
2  3  0.3  2.580
3  4  0.5  3.290
4  5  0.2  4.658