Pandas: find the average of the past 7 days based on the current date

Time: 2021-04-15 07:09:34

Tags: python pandas

I have a DataFrame:

df1:

    Date       Circle  Node    Data
    3/7/2021   DL      VLL-02  99.27673
    3/8/2021   DL      VLL-02  99.320505
    3/9/2021   DL      VLL-02  99.312279
    3/10/2021  DL      VLL-02  99.340827
    3/11/2021  DL      VLL-02  99.286441
    3/12/2021  DL      VLL-02  99.242238
    3/13/2021  DL      VLL-02  99.346517
    3/14/2021  DL      VLL-02  99.301205
    3/15/2021  DL      VLL-02  99.321673
    3/16/2021  DL      VLL-02  99.139963
    3/17/2021  DL      VLL-02  99.251693
    3/18/2021  DL      VLL-02  99.211517
    3/19/2021  DL      VLL-02  99.324249
    3/20/2021  DL      VLL-02  99.22366
    3/21/2021  DL      VLL-02  99.237415
    3/22/2021  DL      VLL-02  99.20808

The result I want:

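The average for each row is computed from the data of the 7 days before it. For example, the average for the 22nd is computed from the 15th through the 21st, which gives 99.2443.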

df2:

    Date       Circle  Node    Data       Average
    3/15/2021  DL      VLL-02  99.321673  99.3071
    3/16/2021  DL      VLL-02  99.139963  99.3073
    3/17/2021  DL      VLL-02  99.251693  99.2826
    3/18/2021  DL      VLL-02  99.211517  99.2699
    3/19/2021  DL      VLL-02  99.324249  99.2529
    3/20/2021  DL      VLL-02  99.22366   99.2709
    3/21/2021  DL      VLL-02  99.237415  99.2534
    3/22/2021  DL      VLL-02  99.20808   99.2443
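
As a quick check of the example for the 22nd, the seven values from 3/15 through 3/21 can be averaged directly. This is only a sketch with the values copied from df1; the names `window` and `average` are illustrative.

```python
# Data values for 3/15/2021 through 3/21/2021, copied from df1.
window = [99.321673, 99.139963, 99.251693, 99.211517,
          99.324249, 99.22366, 99.237415]

# Average of the 7 days preceding 3/22/2021.
average = sum(window) / len(window)
print(round(average, 4))  # 99.2443
```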

1 answer:

Answer 0 (score: 0)

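import numpy as np
import pandas as pd

# The comparisons below need real datetimes, so parse Date if it is still M/D/YYYY text.
df1['Date'] = pd.to_datetime(df1['Date'], format='%m/%d/%Y')
current_date = pd.to_datetime('2021-03-22')
past_date = current_date - pd.Timedelta(days=8)  # rows after this date are reported
results = []
for name, group_df in df1.groupby('Node'):
    sdf = group_df.sort_values(by=['Date'])
    # .copy() avoids assigning into a slice of sdf (SettingWithCopyWarning).
    odf = sdf[(sdf['Date'] > past_date) & (sdf['Date'] <= current_date)].copy()
    odf['AVG'] = np.nan
    for index, row in odf.iterrows():
        # Mean of the 7 days before this row's date; the row itself is excluded.
        opdate = row['Date'] - pd.Timedelta(days=7)
        tmp_df = sdf[(sdf['Date'] >= opdate) & (sdf['Date'] < row['Date'])]
        rft = tmp_df['Data'].mean()
        odf.at[index, 'AVG'] = rft  # set_value() was removed in pandas 1.0
    results.append(odf)
df2 = pd.concat(results)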
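
The row-by-row loop above could probably be replaced by a vectorized version. The following is a minimal sketch, not the answerer's method: it assumes each Node has exactly one row per calendar day with no gaps, so "the previous 7 days" is the same as "the previous 7 rows". The column name `Average` follows the question's df2, and the sample frame is rebuilt from the question's data so the block runs on its own.

```python
import pandas as pd

# Sample rows copied from df1 in the question.
df1 = pd.DataFrame({
    "Date": pd.date_range("2021-03-07", "2021-03-22", freq="D"),
    "Circle": "DL",
    "Node": "VLL-02",
    "Data": [99.27673, 99.320505, 99.312279, 99.340827, 99.286441,
             99.242238, 99.346517, 99.301205, 99.321673, 99.139963,
             99.251693, 99.211517, 99.324249, 99.22366, 99.237415,
             99.20808],
})

df1 = df1.sort_values(["Node", "Date"])

# shift(1) drops the current row from its own window; rolling(7) then
# averages the 7 preceding rows. Assumes one row per Node per day, no gaps.
df1["Average"] = (
    df1.groupby("Node")["Data"]
       .transform(lambda s: s.shift(1).rolling(7).mean())
)

# Keep the 8 days ending at the current date, as in the desired df2.
current_date = pd.Timestamp("2021-03-22")
start_date = current_date - pd.Timedelta(days=7)
df2 = df1[(df1["Date"] >= start_date) & (df1["Date"] <= current_date)]
print(df2.round({"Average": 4}))
```

If days can be missing for a Node, a time-based window (for example `rolling("7D", closed="left")` on a Date index) would be closer to what the loop does, since it selects rows by date rather than by position.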