Pandas: Exponential smoothing function for column

Time: 2013-05-26 17:48:01

Tags: python statistics pandas

I have the following DataFrame with trade data:

import pandas as pd
import datetime as DT

df = pd.DataFrame({
    'Trader': 'Carl Mark Carl Joe Mark Carl Max Max'.split(),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 1],
    'Date': [
        DT.datetime(2013, 1, 1, 13, 0),
        DT.datetime(2013, 1, 1, 13, 5),
        DT.datetime(2013, 2, 5, 20, 0),
        DT.datetime(2013, 2, 6, 10, 0),
        DT.datetime(2013, 2, 8, 12, 0),
        DT.datetime(2013, 3, 7, 14, 0),
        DT.datetime(2013, 6, 4, 14, 0),
        DT.datetime(2013, 7, 4, 14, 0),
        ]})

df.index = [df.Date, df.Trader]

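I want to compute weekly statistics of the average order quantity for each trader. To do that, I currently unstack the Trader level and resample the data with: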

df[['Quantity']].unstack('Trader').resample('1W').mean().fillna(0)

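Is it possible to build, for each trader, a column holding a trend function of the quantity (ideally an exponential smoothing function based on that trader's previous trades)?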

Thanks,

Andy

1 answer:

Answer 0 (score: 8)

Perhaps you are looking for an exponentially weighted moving average:

import pandas as pd
import datetime as DT

df = pd.DataFrame({
    'Trader': 'Carl Mark Carl Joe Mark Carl Max Max'.split(),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 1],
    'Date': [
        DT.datetime(2013, 1, 1, 13, 0),
        DT.datetime(2013, 1, 1, 13, 5),
        DT.datetime(2013, 2, 5, 20, 0),
        DT.datetime(2013, 2, 6, 10, 0),
        DT.datetime(2013, 2, 8, 12, 0),
        DT.datetime(2013, 3, 7, 14, 0),
        DT.datetime(2013, 6, 4, 14, 0),
        DT.datetime(2013, 7, 4, 14, 0),
        ]})

df.index = [df.Date, df.Trader]
# Weekly mean quantity per trader; weeks without trades become 0
df2 = df[['Quantity']].unstack('Trader').resample('1W').mean().fillna(0)
print(df2.ewm(span=7).mean())

yields

            Quantity                              
Trader          Carl       Joe      Mark       Max
Date                                              
2013-01-06  5.000000  0.000000  2.000000  0.000000
2013-01-13  2.142857  0.000000  0.857143  0.000000
2013-01-20  1.216216  0.000000  0.486486  0.000000
2013-01-27  0.771429  0.000000  0.308571  0.000000
2013-02-03  0.518566  0.000000  0.207426  0.000000
2013-02-10  1.881497  3.041283  0.448470  0.000000
2013-02-17  1.338663  2.163837  0.319081  0.000000
2013-02-24  0.966766  1.562696  0.230437  0.000000
2013-03-03  0.705454  1.140307  0.168151  0.000000
2013-03-10  1.843158  0.838219  0.123605  0.000000
2013-03-17  1.362049  0.619423  0.091341  0.000000
2013-03-24  1.010398  0.459502  0.067759  0.000000
2013-03-31  0.751651  0.341831  0.050407  0.000000
2013-04-07  0.560329  0.254823  0.037576  0.000000
2013-04-14  0.418350  0.190254  0.028055  0.000000
2013-04-21  0.312703  0.142209  0.020970  0.000000
2013-04-28  0.233936  0.106388  0.015688  0.000000
2013-05-05  0.175120  0.079640  0.011744  0.000000
2013-05-12  0.131154  0.059645  0.008795  0.000000
2013-05-19  0.098261  0.044687  0.006590  0.000000
2013-05-26  0.073637  0.033488  0.004938  0.000000
2013-06-02  0.055195  0.025101  0.003701  0.000000
2013-06-09  0.041378  0.018818  0.002775  0.500670
2013-06-16  0.031023  0.014108  0.002080  0.375377
2013-06-23  0.023261  0.010579  0.001560  0.281462
2013-06-30  0.017443  0.007933  0.001170  0.211057
2013-07-07  0.013080  0.005949  0.000877  0.408376
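
To see where these numbers come from, the first few values of the Carl column can be reproduced by hand. This is only a sketch, assuming the pandas default adjust=True, under which span=7 corresponds to alpha = 2 / (span + 1) = 0.25 and each value is a weighted average of all observations so far with weights (1 - alpha)^i. The helper name ewma_adjusted is made up for illustration.

```python
import numpy as np
import pandas as pd

def ewma_adjusted(values, span):
    """Hand-rolled EWMA using the adjust=True weighting (hypothetical helper)."""
    alpha = 2.0 / (span + 1)
    values = np.asarray(values, dtype=float)
    out = np.empty(len(values))
    for t in range(len(values)):
        # Newest observation gets weight 1, older ones get (1 - alpha)^1, ^2, ...
        weights = (1 - alpha) ** np.arange(t, -1, -1)
        out[t] = np.dot(weights, values[: t + 1]) / weights.sum()
    return out

# Carl's first six weekly means, taken from the resampled data
carl = [5, 0, 0, 0, 0, 5]
by_hand = ewma_adjusted(carl, span=7)
with_pandas = pd.Series(carl, dtype=float).ewm(span=7).mean().to_numpy()
print(by_hand)
print(with_pandas)
```

The two printed arrays should agree, and the second and sixth entries correspond to the 2013-01-13 and 2013-02-10 rows of the Carl column above.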

To concatenate this result with df2:

df3 = df2.ewm(span=7).mean()
df3.columns = pd.MultiIndex.from_tuples([('EWMA', item[1]) for item in df3.columns])
df2 = pd.concat([df2, df3], axis=1) 

print(df2)

yields

            Quantity                      EWMA                              
Trader          Carl  Joe  Mark  Max      Carl       Joe      Mark       Max
Date                                                                        
2013-01-06         5    0     2    0  5.000000  0.000000  2.000000  0.000000
2013-01-13         0    0     0    0  2.142857  0.000000  0.857143  0.000000
2013-01-20         0    0     0    0  1.216216  0.000000  0.486486  0.000000
2013-01-27         0    0     0    0  0.771429  0.000000  0.308571  0.000000
2013-02-03         0    0     0    0  0.518566  0.000000  0.207426  0.000000
2013-02-10         5   10     1    0  1.881497  3.041283  0.448470  0.000000
2013-02-17         0    0     0    0  1.338663  2.163837  0.319081  0.000000
2013-02-24         0    0     0    0  0.966766  1.562696  0.230437  0.000000
2013-03-03         0    0     0    0  0.705454  1.140307  0.168151  0.000000
2013-03-10         5    0     0    0  1.843158  0.838219  0.123605  0.000000
2013-03-17         0    0     0    0  1.362049  0.619423  0.091341  0.000000
2013-03-24         0    0     0    0  1.010398  0.459502  0.067759  0.000000
2013-03-31         0    0     0    0  0.751651  0.341831  0.050407  0.000000
2013-04-07         0    0     0    0  0.560329  0.254823  0.037576  0.000000
2013-04-14         0    0     0    0  0.418350  0.190254  0.028055  0.000000
2013-04-21         0    0     0    0  0.312703  0.142209  0.020970  0.000000
2013-04-28         0    0     0    0  0.233936  0.106388  0.015688  0.000000
2013-05-05         0    0     0    0  0.175120  0.079640  0.011744  0.000000
2013-05-12         0    0     0    0  0.131154  0.059645  0.008795  0.000000
2013-05-19         0    0     0    0  0.098261  0.044687  0.006590  0.000000
2013-05-26         0    0     0    0  0.073637  0.033488  0.004938  0.000000
2013-06-02         0    0     0    0  0.055195  0.025101  0.003701  0.000000
2013-06-09         0    0     0    2  0.041378  0.018818  0.002775  0.500670
2013-06-16         0    0     0    0  0.031023  0.014108  0.002080  0.375377
2013-06-23         0    0     0    0  0.023261  0.010579  0.001560  0.281462
2013-06-30         0    0     0    0  0.017443  0.007933  0.001170  0.211057
2013-07-07         0    0     0    1  0.013080  0.005949  0.000877  0.408376
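
If the smoothing should depend only on each trader's own previous trades, rather than on zero-filled weeks, one possible variation is to apply ewm within each trader group on the raw trades. This is a sketch of that alternative and not part of the answer above; the frame name trades and the column name 'EWMA' are chosen here for illustration, and span=7 is simply reused from the weekly example.

```python
import datetime as DT
import pandas as pd

trades = pd.DataFrame({
    'Trader': 'Carl Mark Carl Joe Mark Carl Max Max'.split(),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 1],
    'Date': [
        DT.datetime(2013, 1, 1, 13, 0),
        DT.datetime(2013, 1, 1, 13, 5),
        DT.datetime(2013, 2, 5, 20, 0),
        DT.datetime(2013, 2, 6, 10, 0),
        DT.datetime(2013, 2, 8, 12, 0),
        DT.datetime(2013, 3, 7, 14, 0),
        DT.datetime(2013, 6, 4, 14, 0),
        DT.datetime(2013, 7, 4, 14, 0),
    ]}).sort_values('Date')

# Smooth each trader's own sequence of trades in chronological order;
# weeks without trades play no role in this variant.
trades['EWMA'] = (
    trades.groupby('Trader')['Quantity']
          .transform(lambda s: s.ewm(span=7).mean())
)
print(trades)
```

Here the time gaps between trades are ignored, so a trade from months ago weighs as much as one from last week would at the same position in the sequence. Whether that is desirable depends on what the trend column is meant to capture.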