This is my df: ts is the timestamp (and the index), and x1 is the value.
                       x1
ts
2017-09-01 17:22:42   7.0
2017-09-01 17:22:53  11.0
2017-09-01 17:23:04   9.0
2017-09-02 17:23:15  15.0
2017-09-03 17:23:26  13.0
2017-09-03 17:23:38  19.0
2017-09-03 17:23:49  13.0
2017-09-04 17:24:00  15.0
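For anyone who wants to reproduce this, here is a minimal sketch that builds the same frame. It assumes the index is a proper DatetimeIndex named ts (the printed output does not show the dtype, so that part is an assumption).

```python
import pandas as pd

# Rebuild the sample frame shown above; timestamps are parsed into a DatetimeIndex.
df = pd.DataFrame(
    {"x1": [7.0, 11.0, 9.0, 15.0, 13.0, 19.0, 13.0, 15.0]},
    index=pd.to_datetime([
        "2017-09-01 17:22:42",
        "2017-09-01 17:22:53",
        "2017-09-01 17:23:04",
        "2017-09-02 17:23:15",
        "2017-09-03 17:23:26",
        "2017-09-03 17:23:38",
        "2017-09-03 17:23:49",
        "2017-09-04 17:24:00",
    ]),
)
df.index.name = "ts"
print(df)
```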
I want a column equal to yesterday's mean plus today's mean:
                       x1  result
ts
2017-09-01 17:22:42   7.0  (7+11+9)/3
2017-09-01 17:22:53  11.0  (7+11+9)/3
2017-09-01 17:23:04   9.0  (7+11+9)/3
2017-09-02 17:23:15  15.0  (7+11+9)/3 + 15/1
2017-09-03 17:23:26  13.0  15/1 + (13+19+13)/3
2017-09-03 17:23:38  19.0  15/1 + (13+19+13)/3
2017-09-03 17:23:49  13.0  15/1 + (13+19+13)/3
2017-09-04 17:24:00  15.0  15/1 + (13+19+13)/3
If there is no data for yesterday, use 0.
Answer 0 (score: 6)
Use pd.merge_asof, pd.DataFrame.resample, and pd.DataFrame.rolling:
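# Take daily means, sum them over a 2-day window (min_periods=1 lets the
# first day stand alone), then match each row to its day's total.
pd.merge_asof(
    df,
    df.resample('D').mean().rolling(2, min_periods=1).sum().rename(columns={'x1': 'result'}),
    left_index=True, right_index=True
)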
                       x1  result
ts
2017-09-01 17:22:42   7.0     9.0
2017-09-01 17:22:53  11.0     9.0
2017-09-01 17:23:04   9.0     9.0
2017-09-02 17:23:15  15.0    24.0
2017-09-03 17:23:26  13.0    30.0
2017-09-03 17:23:38  19.0    30.0
2017-09-03 17:23:49  13.0    30.0
2017-09-04 17:24:00  15.0    30.0
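To see where those numbers come from, the one-liner can be split into its intermediate steps. This is only an illustrative sketch: the names daily_mean, two_day_sum, and out are made up here, and the frame is rebuilt from the question's sample data.

```python
import pandas as pd

df = pd.DataFrame(
    {"x1": [7.0, 11.0, 9.0, 15.0, 13.0, 19.0, 13.0, 15.0]},
    index=pd.to_datetime([
        "2017-09-01 17:22:42", "2017-09-01 17:22:53", "2017-09-01 17:23:04",
        "2017-09-02 17:23:15",
        "2017-09-03 17:23:26", "2017-09-03 17:23:38", "2017-09-03 17:23:49",
        "2017-09-04 17:24:00",
    ]),
)
df.index.name = "ts"

# Step 1: one row per calendar day holding that day's mean of x1.
daily_mean = df.resample("D").mean()

# Step 2: today's mean plus yesterday's mean; min_periods=1 means the
# first day (which has no yesterday) keeps just its own mean.
two_day_sum = daily_mean.rolling(2, min_periods=1).sum().rename(columns={"x1": "result"})

# Step 3: attach to each original row the total of the most recent
# daily timestamp at or before it (midnight of the same day).
out = pd.merge_asof(df, two_day_sum, left_index=True, right_index=True)
print(out)
```

The daily keys sit at midnight, so the default backward match in merge_asof picks each row's own day.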
Answer 1 (score: 1)
I am assuming the date 2017-09-02 is missing from the data:
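# Calendar date of each row, and the mean of x1 within each date
df['group'] = pd.to_datetime(df.index)
df['group'] = df['group'].dt.date
df['meanval'] = df.groupby('group').x1.transform('mean')
# Every calendar date between the first and the last observation
id1 = pd.Series(pd.date_range(df.group.min(), df.group.max(), freq='D')).dt.date.to_frame(name='group')
# Append the missing dates with a mean of 0
idx = pd.concat([df, id1[~id1.group.isin(df.group)]], axis=0).sort_values('group').fillna(0)
# One row per date, then today's mean + yesterday's mean
# (min_periods=1 replaces the hard-coded fillna(9) for the first date)
idx = idx.drop_duplicates(['group']).set_index('group')[['meanval']].rolling(2, min_periods=1).sum()
df['meanval'] = df['group'].map(idx['meanval'])
df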
Out[680]:
                     x1       group  meanval
ts
2017-09-01 17:22:42   7  2017-09-01      9.0
2017-09-01 17:22:53  11  2017-09-01      9.0
2017-09-01 17:23:04   9  2017-09-01      9.0
2017-09-03 17:23:26  13  2017-09-03     15.0
2017-09-03 17:23:38  19  2017-09-03     15.0
2017-09-03 17:23:49  13  2017-09-03     15.0
2017-09-04 17:24:00  15  2017-09-04     30.0
Data input:
df
Out[682]:
                     x1
ts
2017-09-01 17:22:42   7
2017-09-01 17:22:53  11
2017-09-01 17:23:04   9
2017-09-03 17:23:26  13
2017-09-03 17:23:38  19
2017-09-03 17:23:49  13
2017-09-04 17:24:00  15
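The same missing-day input can also be handled more compactly. The sketch below is one possible variant, not the code used above: it reindexes the daily means onto a full calendar (filling absent days with 0) and adds the previous day's value with shift. The names days, daily, full, and total are illustrative only.

```python
import pandas as pd

# Sample input with 2017-09-02 missing, as in the data above.
df = pd.DataFrame(
    {"x1": [7, 11, 9, 13, 19, 13, 15]},
    index=pd.to_datetime([
        "2017-09-01 17:22:42", "2017-09-01 17:22:53", "2017-09-01 17:23:04",
        "2017-09-03 17:23:26", "2017-09-03 17:23:38", "2017-09-03 17:23:49",
        "2017-09-04 17:24:00",
    ]),
)
df.index.name = "ts"

# Midnight-normalized day for every row.
days = df.index.normalize()

# Mean of x1 per day that actually has data.
daily = df.groupby(days)["x1"].mean()

# Fill in the absent calendar days with a mean of 0.
calendar = pd.date_range(daily.index.min(), daily.index.max(), freq="D")
full = daily.reindex(calendar, fill_value=0)

# Today's mean plus yesterday's mean (0 before the first day).
total = full + full.shift(1, fill_value=0)

# Look up each row's day in the per-day totals.
df["meanval"] = total.reindex(days).to_numpy()
print(df)
```

On this input it yields the same meanval column as the output shown above (9.0, 15.0, and 30.0 for the three days present).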