Question

I have a Series object with the following contents:

df = 
    index              value
2014-05-23 07:00:00     0.67
2014-05-23 07:30:00     0.47
2014-05-23 08:00:00     0.42
2014-05-23 08:30:00     0.80
....

2017-07-10 22:00:00     0.42
2017-07-10 22:30:00     0.79
2017-07-10 23:00:00     0.84
2017-07-10 23:30:00     NaN

I want to average the values across years, month by month, so that the resulting data frame looks like this:

df_new = 
  index                    value
   Jan      {0.11, 0.5, 0.3, 0.99, ... ,0.13} <-  time step of each value is 
   Feb      {...............................}     still 30 min, and each 
   Mar      {...............................}     value is average of same 
   Apr      {...............................}     time in the other year.  
   ....
   Dec      {...............................}

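I have several data frames like this, but with different time steps (15 min, 60 min, ...). Is there a better way to compute this automatically? For example, something like a function that detects the time step of the index by itself. Thanks in advance!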

Answer 1

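I think you first need to upsample or downsample with resample, so that the index has a regular time step.

Then group by DatetimeIndex.strftime (converted to an ordered Categorical, so the months sort in calendar order rather than alphabetically) together with DatetimeIndex.time, aggregate with mean, and finally reshape with unstack:

    cats = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
    months = pd.Categorical(s.index.strftime('%b'), categories=cats, ordered=True)
    df = s.groupby([months, s.index.time]).mean().unstack()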
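The steps above can be wrapped in a function that detects the time step on its own. This is only a sketch under some assumptions: the helper name monthly_time_of_day_mean and the synthetic hourly data are made up for illustration, pd.infer_freq is used to guess the step (it returns None for an irregular index, in which case the frequency has to be passed explicitly), and the month labels are built from the month numbers with pd.Categorical.from_codes instead of strftime('%b') so that the result does not depend on the system locale.

```python
import numpy as np
import pandas as pd

MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def monthly_time_of_day_mean(s, freq=None):
    """Average a Series per (month, time of day) across all years.

    Hypothetical helper: `freq` is inferred from the index when omitted.
    """
    if freq is None:
        freq = pd.infer_freq(s.index)
        if freq is None:
            raise ValueError("could not infer a regular time step; pass freq explicitly")

    # Put the data on a regular grid with the detected (or given) step.
    s = s.resample(freq).mean()

    # Ordered categorical built from month numbers (1..12 -> codes 0..11),
    # so rows come out in calendar order.
    months = pd.Categorical.from_codes(s.index.month - 1,
                                       categories=MONTHS, ordered=True)

    # One row per month, one column per time of day.
    # observed=False keeps months that have no data as all-NaN rows.
    grouped = s.groupby([months, s.index.time], observed=False).mean()
    return grouped.unstack()


# Synthetic example: two non-leap years of hourly data,
# 1.0 throughout 2014 and 3.0 throughout 2015.
idx = pd.date_range('2014-01-01', '2015-12-31 23:00', freq='60min')
s = pd.Series(np.where(idx.year == 2014, 1.0, 3.0), index=idx)

result = monthly_time_of_day_mean(s)
print(result.iloc[:3, :4])
```

The same call should work unchanged for a 15-minute or 30-minute Series, as long as the index is regular enough for the step to be inferred; the number of columns then follows from the step.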
