Calculate the mean of all columns every 12 hours

Time: 2018-09-19 07:18:51

Tags: python pandas dataframe mean

I have a data frame with more than 200,000 rows, and I want to calculate the mean of every column over each 12-hour window.

          DateTime Speed   TRQ         ...    PtoP3  RMS3   Crest3
0       2016-07-01 00:00   994  35.4   ...       NA    NA       NA
1       2016-07-01 00:01   995  34.6   ...       NA    NA       NA
2       2016-07-01 00:02   995    34   ...       NA    NA       NA

I wrote this:

Present_data.to_datetime(Present_data['DateTime'])
Total_12hravg_all = Present_data.groupby(pd.Grouper(freq='12H', key='DateTime')).mean()
print(Total_12hravg_all)

and got this error:

  

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

1 Answer:

Answer 0 (score: 1)

If DateTime is a column:

Your approach should work once the result of pd.to_datetime is assigned back to the column:

Present_data['DateTime'] = pd.to_datetime(Present_data['DateTime'])
Total_12hravg_all = Present_data.groupby(pd.Grouper(freq='12H', key='DateTime')).mean()
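
The assignment is the part that matters: pd.to_datetime returns a new Series and does not change the DataFrame in place, so calling it without assigning leaves the column as text and pd.Grouper still sees a non-datetime key. A minimal sketch of that behavior, using a made-up three-row frame in place of the real Present_data:

```python
import pandas as pd

# Hypothetical miniature of Present_data (values are illustrative only)
Present_data = pd.DataFrame({
    "DateTime": ["2016-07-01 00:00", "2016-07-01 00:01", "2016-07-01 00:02"],
    "Speed": [994, 995, 995],
})

# Converting without assigning: the result is discarded, the column stays text
pd.to_datetime(Present_data["DateTime"])
was_datetime = pd.api.types.is_datetime64_any_dtype(Present_data["DateTime"])

# Assigning the result back actually changes the column's dtype
Present_data["DateTime"] = pd.to_datetime(Present_data["DateTime"])
is_datetime = pd.api.types.is_datetime64_any_dtype(Present_data["DateTime"])

print(was_datetime, is_datetime)
```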

Another solution is to use resample with the on parameter:

Present_data['DateTime'] = pd.to_datetime(Present_data['DateTime'])
Total_12hravg_all = Present_data.resample('12H', on='DateTime').mean()

Or create a DatetimeIndex:

Present_data['DateTime'] = pd.to_datetime(Present_data['DateTime'])
Present_data = Present_data.set_index('DateTime')
Total_12hravg_all = Present_data.groupby(pd.Grouper(freq='12H')).mean()
#resample
#Total_12hravg_all = Present_data.resample('12H').mean()
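
The variants above should all produce the same 12-hour bins. A small sketch to check this on synthetic data follows; the frame df and its hourly Speed values are invented for illustration, and the lowercase alias '12h' is used because recent pandas releases prefer it over '12H':

```python
import pandas as pd

# Hypothetical data: one reading per hour over two days
df = pd.DataFrame({
    "DateTime": pd.date_range("2016-07-01", periods=48, freq="h"),
    "Speed": range(48),
})

# 1) groupby with pd.Grouper on the DateTime column
by_grouper = df.groupby(pd.Grouper(freq="12h", key="DateTime")).mean()

# 2) resample with the on parameter
by_resample = df.resample("12h", on="DateTime").mean()

# 3) resample after moving DateTime into the index
by_index = df.set_index("DateTime").resample("12h").mean()

print(by_grouper)
```

Each result has four rows, one per 12-hour window, indexed by the start of the window.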

If DateTime is the index:

Present_data.index = pd.to_datetime(Present_data.index)

Total_12hravg_all = Present_data.groupby(pd.Grouper(freq='12H')).mean()
#resample
#Total_12hravg_all = Present_data.resample('12H').mean()

Final solution:

Present_data['DateTime'] = pd.to_datetime(Present_data['DateTime'])
Present_data = Present_data.set_index('DateTime')

#convert non-numeric values to NaNs
Present_data = Present_data.apply(lambda x: pd.to_numeric(x, errors='coerce'))

Total_12hravg_all = Present_data.groupby(pd.Grouper(freq='12H')).mean()
#resample
#Total_12hravg_all = Present_data.resample('12H').mean()
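
To see the whole pipeline run end to end, here is a minimal sketch on a made-up four-row frame. It assumes the columns arrived as text with the literal string "NA" for missing readings, which is one way the coercion step becomes necessary; the sample values are invented, and '12h' is the lowercase spelling of the same 12-hour frequency:

```python
import pandas as pd

# Hypothetical sample: every column is text, missing readings are the string "NA"
Present_data = pd.DataFrame({
    "DateTime": ["2016-07-01 00:00", "2016-07-01 06:00",
                 "2016-07-01 12:00", "2016-07-01 18:00"],
    "Speed": ["994", "995", "995", "997"],
    "RMS3": ["NA", "NA", "1.5", "2.5"],
})

# Parse the timestamps and use them as the index
Present_data["DateTime"] = pd.to_datetime(Present_data["DateTime"])
Present_data = Present_data.set_index("DateTime")

# Turn every column numeric; anything unparseable (such as "NA") becomes NaN
Present_data = Present_data.apply(lambda x: pd.to_numeric(x, errors="coerce"))

# Average each column over 12-hour windows; NaN values are skipped by mean()
Total_12hravg_all = Present_data.resample("12h").mean()
print(Total_12hravg_all)
```

A window in which a column has no valid readings at all, like RMS3 in the first 12 hours here, comes out as missing rather than raising an error.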