Group index by minute and compute the average

Time: 2016-10-10 06:58:05

Tags: python pandas average minute pandas-groupby

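I have a pandas DataFrame named 'df', and I want to drop the seconds so that the index is in YYYY-MM-DD HH:MM format only. The rows should also be grouped by minute, showing the average value for each minute.

So I want to convert this DataFrame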

                        value
2015-05-03 00:00:00     61.0
2015-05-03 00:00:10     60.0
2015-05-03 00:00:25     60.0
2015-05-03 00:00:30     61.0
2015-05-03 00:00:45     61.0
2015-05-03 00:01:00     61.0
2015-05-03 00:01:10     60.0
2015-05-03 00:01:25     60.0
2015-05-03 00:01:30     61.0
2015-05-03 00:01:45     61.0
2015-05-03 00:02:00     61.0
2015-05-03 00:02:10     60.0
2015-05-03 00:02:25     60.0
2015-05-03 00:02:40     60.0
2015-05-03 00:02:55     60.0
2015-05-03 00:03:00     59.0
2015-05-03 00:03:15     59.0
2015-05-03 00:03:20     59.0
2015-05-03 00:03:35     59.0
2015-05-03 00:03:40     60.0

into this DataFrame

                        value
2015-05-03 00:00        60.6
2015-05-03 00:01        60.6
2015-05-03 00:02        60.2
2015-05-03 00:03        59.2

I have tried code like
df['value'].resample('1Min').mean()

df.index.resample('1Min').mean()

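But this doesn't seem to work. Any ideas?

1 Answer:

Answer 0: (score: 4)

You need to convert the index to a DatetimeIndex first, because resample only works on a datetime-like index: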

import pandas as pd

# convert the index to a DatetimeIndex
df.index = pd.DatetimeIndex(df.index)
# alternative:
# df.index = pd.to_datetime(df.index)

print(df['value'].resample('1Min').mean())
# equivalent alternative:
# print(df.resample('1Min')['value'].mean())
2015-05-03 00:00:00    60.6
2015-05-03 00:01:00    60.6
2015-05-03 00:02:00    60.2
2015-05-03 00:03:00    59.2
Freq: T, Name: value, dtype: float64
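
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes the index was loaded as plain strings (the question does not say how the frame was built), and it uses the lowercase '1min' alias, which should behave the same as '1Min' above. The name minute_mean is only for illustration.

```python
import pandas as pd

# Rows copied from the question. Building the index from strings is an
# assumption about how the asker's frame ended up without a DatetimeIndex.
rows = [
    ("2015-05-03 00:00:00", 61.0), ("2015-05-03 00:00:10", 60.0),
    ("2015-05-03 00:00:25", 60.0), ("2015-05-03 00:00:30", 61.0),
    ("2015-05-03 00:00:45", 61.0), ("2015-05-03 00:01:00", 61.0),
    ("2015-05-03 00:01:10", 60.0), ("2015-05-03 00:01:25", 60.0),
    ("2015-05-03 00:01:30", 61.0), ("2015-05-03 00:01:45", 61.0),
    ("2015-05-03 00:02:00", 61.0), ("2015-05-03 00:02:10", 60.0),
    ("2015-05-03 00:02:25", 60.0), ("2015-05-03 00:02:40", 60.0),
    ("2015-05-03 00:02:55", 60.0), ("2015-05-03 00:03:00", 59.0),
    ("2015-05-03 00:03:15", 59.0), ("2015-05-03 00:03:20", 59.0),
    ("2015-05-03 00:03:35", 59.0), ("2015-05-03 00:03:40", 60.0),
]
df = pd.DataFrame({"value": [v for _, v in rows]}, index=[t for t, _ in rows])

# Parse the string labels into timestamps so that resample can bin them.
df.index = pd.to_datetime(df.index)

# One bin per minute, averaging the values that fall into each bin.
minute_mean = df["value"].resample("1min").mean()
print(minute_mean)
```

Note that resample produces a row for every minute between the first and last timestamp, so a minute with no data shows up as NaN.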

Another solution is to set the seconds in the index to 0 with astype and group by the result:

print (df.groupby([df.index.values.astype('<M8[m]')])['value'].mean())
2015-05-03 00:00:00    60.6
2015-05-03 00:01:00    60.6
2015-05-03 00:02:00    60.2
2015-05-03 00:03:00    59.2
Name: value, dtype: float64
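
Neither output above has the YYYY-MM-DD HH:MM labels the question asked for. One possible way to get them, sketched here on a subset of the question's data, is to group on the index floored to the minute and then format the result with strftime. The floor call is swapped in for the astype trick and is not part of the original answer; per_minute is a hypothetical name.

```python
import pandas as pd

# Subset of the question's data (minutes 00:02 and 00:03), for brevity.
idx = pd.to_datetime([
    "2015-05-03 00:02:00", "2015-05-03 00:02:10", "2015-05-03 00:02:25",
    "2015-05-03 00:02:40", "2015-05-03 00:02:55",
    "2015-05-03 00:03:00", "2015-05-03 00:03:15", "2015-05-03 00:03:20",
    "2015-05-03 00:03:35", "2015-05-03 00:03:40",
])
df = pd.DataFrame(
    {"value": [61.0, 60.0, 60.0, 60.0, 60.0, 59.0, 59.0, 59.0, 59.0, 60.0]},
    index=idx,
)

# Round every timestamp down to the start of its minute and group on that.
per_minute = df.groupby(df.index.floor("min"))["value"].mean()

# Turn the timestamps into 'YYYY-MM-DD HH:MM' strings. After this step the
# index holds plain strings, so it can no longer be resampled directly.
per_minute.index = per_minute.index.strftime("%Y-%m-%d %H:%M")
print(per_minute)
```

Unlike resample, groupby only returns minutes that actually occur in the data, so gaps are skipped rather than filled with NaN.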