Hourly 7-day average with pandas

Time: 2020-07-15 05:51:58

Tags: pandas mean rolling-computation

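I need some help computing a 7-day average for each hour.

The time series has hourly resolution, and I need the 7-day average for each hour of the day, e.g. for 13:00: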

date, x
2020-07-01 13:00 , 4
2020-07-01 14:00 , 3
.
.
.
2020-07-02 13:00 , 3
2020-07-02 14:00 , 7
.
.
.

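I tried this with pandas and a rolling mean, but the rolling window includes everything in the last 7 days rather than only the same hour on each day. Thanks for any hints!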

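1 Answer:

Answer 0: (score: 1)

Add a new hour column, group by that column, and then compute a rolling mean over 7 rows within each group. Each group holds one row per day for a given hour, so a 7-row window is a 7-day average for that hour, which matches the intent of the question.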

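# Assumes df has a DatetimeIndex and a numeric column 'x'.
df['hour'] = df.index.hour
# Within each hour-of-day group, average the current row and the 6 rows (days) before it.
df = df.groupby('hour')['x'].rolling(7).mean().reset_index()
df.head(35)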

   hour level_1 x
0   0   2020-07-01 00:00:00 NaN
1   0   2020-07-02 00:00:00 NaN
2   0   2020-07-03 00:00:00 NaN
3   0   2020-07-04 00:00:00 NaN
4   0   2020-07-05 00:00:00 NaN
5   0   2020-07-06 00:00:00 NaN
6   0   2020-07-07 00:00:00 48.142857
7   0   2020-07-08 00:00:00 50.285714
8   0   2020-07-09 00:00:00 60.000000
9   0   2020-07-10 00:00:00 63.142857
10  1   2020-07-01 01:00:00 NaN
11  1   2020-07-02 01:00:00 NaN
12  1   2020-07-03 01:00:00 NaN
13  1   2020-07-04 01:00:00 NaN
14  1   2020-07-05 01:00:00 NaN
15  1   2020-07-06 01:00:00 NaN
16  1   2020-07-07 01:00:00 52.571429
17  1   2020-07-08 01:00:00 48.428571
18  1   2020-07-09 01:00:00 38.000000
19  2   2020-07-01 02:00:00 NaN
20  2   2020-07-02 02:00:00 NaN
21  2   2020-07-03 02:00:00 NaN
22  2   2020-07-04 02:00:00 NaN
23  2   2020-07-05 02:00:00 NaN
24  2   2020-07-06 02:00:00 NaN
25  2   2020-07-07 02:00:00 46.571429
26  2   2020-07-08 02:00:00 47.714286
27  2   2020-07-09 02:00:00 42.714286
28  3   2020-07-01 03:00:00 NaN
29  3   2020-07-02 03:00:00 NaN
30  3   2020-07-03 03:00:00 NaN
31  3   2020-07-04 03:00:00 NaN
32  3   2020-07-05 03:00:00 NaN
33  3   2020-07-06 03:00:00 NaN
34  3   2020-07-07 03:00:00 72.571429
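
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The data is synthetic and hypothetical (x is simply the day of the month, so the expected averages are easy to check by hand), and the column name x_7d_same_hour is made up for illustration. It uses transform instead of reset_index so the result stays aligned with the original hourly index; that is a variation on the answer above, not the answer's exact code.

```python
import pandas as pd

# Hypothetical data: 10 days of hourly values starting 2020-07-01 00:00.
# x equals the day of the month (1..10) for every hour of that day.
idx = pd.date_range("2020-07-01 00:00", periods=10 * 24, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"x": idx.day.to_numpy().astype(float)}, index=idx)

# Group rows by hour of day; each group then has exactly one row per day.
hour_of_day = df.index.hour

# A 7-row rolling mean inside each group is a 7-day mean for that hour.
# transform returns a Series aligned with df's original index.
df["x_7d_same_hour"] = (
    df.groupby(hour_of_day)["x"].transform(lambda s: s.rolling(7).mean())
)

# Rows for 13:00 only: NaN for the first 6 days, then the mean of the last 7 days.
print(df.loc[df.index.hour == 13, ["x", "x_7d_same_hour"]])
```

Note that rolling(7) counts rows, not calendar days, so this assumes the series has no missing hours. If days can be missing, the first six NaN values can be relaxed with min_periods, or the window can be made time-based, depending on what result is wanted.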