I have a DataFrame
df1:
Date Circle Node Data
3/7/2021 DL VLL-02 99.27673
3/8/2021 DL VLL-02 99.320505
3/9/2021 DL VLL-02 99.312279
3/10/2021 DL VLL-02 99.340827
3/11/2021 DL VLL-02 99.286441
3/12/2021 DL VLL-02 99.242238
3/13/2021 DL VLL-02 99.346517
3/14/2021 DL VLL-02 99.301205
3/15/2021 DL VLL-02 99.321673
3/16/2021 DL VLL-02 99.139963
3/17/2021 DL VLL-02 99.251693
3/18/2021 DL VLL-02 99.211517
3/19/2021 DL VLL-02 99.324249
3/20/2021 DL VLL-02 99.22366
3/21/2021 DL VLL-02 99.237415
3/22/2021 DL VLL-02 99.20808
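The result I want:
The average is computed from the previous 7 days of data, not counting the current day. For example, the average for the 22nd is computed from the 15th through the 21st, which gives 99.2443.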
df2:
Date Circle Node Data Average
3/15/2021 DL VLL-02 99.321673 99.3071
3/16/2021 DL VLL-02 99.139963 99.3073
3/17/2021 DL VLL-02 99.251693 99.2826
3/18/2021 DL VLL-02 99.211517 99.2699
3/19/2021 DL VLL-02 99.324249 99.2529
3/20/2021 DL VLL-02 99.22366 99.2709
3/21/2021 DL VLL-02 99.237415 99.2534
3/22/2021 DL VLL-02 99.20808 99.2443
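As a quick check of the example figure, the seven `Data` values for 3/15 through 3/21 can be averaged by hand. This is only an arithmetic sketch using the numbers from df1 above; the variable names are illustrative.

```python
# Data values for 3/15/2021 through 3/21/2021, copied from df1
values = [99.321673, 99.139963, 99.251693, 99.211517,
          99.324249, 99.22366, 99.237415]

# Plain mean of the seven preceding days
average = sum(values) / len(values)
print(round(average, 4))  # 99.2443, the Average shown for 3/22/2021
```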
Answer 0 (score: 0)
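import numpy as np
import pandas as pd

# The comparisons below need real datetimes, not strings like '3/7/2021'
df1['Date'] = pd.to_datetime(df1['Date'])
current_date = pd.to_datetime('2021-03-22')
past_date = current_date - pd.Timedelta(days=8)
for name, group_df in df1.groupby('Node'):
    sdf = group_df.sort_values(by=['Date'])
    # Rows to report: the 8 days ending at current_date (copy to avoid writing to a slice)
    odf = sdf[(sdf['Date'] > past_date) & (sdf['Date'] <= current_date)].copy()
    odf['AVG'] = np.nan
    for index, row in odf.iterrows():
        # Average the 7 days strictly before this row's date
        opdate = row['Date'] - pd.Timedelta(days=7)
        tmp_df = sdf[(sdf['Date'] >= opdate) & (sdf['Date'] < row['Date'])]
        rft = tmp_df['Data'].mean()
        # DataFrame.set_value was removed in pandas 1.0; .at is the replacement
        odf.at[index, 'AVG'] = rft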
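The row-by-row loop above can probably be replaced by a shifted rolling mean. The sketch below is one possible alternative, not the answer's own method. It assumes each Node has exactly one row per calendar day with no gaps, so a window of 7 rows is the same as a window of 7 days. The function name `add_trailing_average` and the sample-data construction are illustrative.

```python
import pandas as pd

def add_trailing_average(df, window_days=7):
    """Add an 'Average' column: mean of the previous `window_days` rows per Node."""
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    out = out.sort_values(["Node", "Date"]).reset_index(drop=True)
    # shift(1) drops the current day from the window; rolling(...).mean()
    # then averages the `window_days` rows before it.
    out["Average"] = out.groupby("Node")["Data"].transform(
        lambda s: s.shift(1).rolling(window_days).mean()
    )
    # Rows without a full window of history have no average yet.
    return out.dropna(subset=["Average"]).reset_index(drop=True)

# Sample data rebuilt from df1 in the question
df1 = pd.DataFrame({
    "Date": pd.date_range("2021-03-07", periods=16, freq="D"),
    "Circle": "DL",
    "Node": "VLL-02",
    "Data": [99.27673, 99.320505, 99.312279, 99.340827, 99.286441,
             99.242238, 99.346517, 99.301205, 99.321673, 99.139963,
             99.251693, 99.211517, 99.324249, 99.22366, 99.237415,
             99.20808],
})

result = add_trailing_average(df1)
print(result[["Date", "Node", "Data", "Average"]])
```

With this sample the first row that has a full 7-day history is 3/14/2021, so the result has one more row than df2; filter on `Date` if only the last 8 days are wanted. If days can be missing, a time-based window would be needed instead of a row count.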