Computing the mean over a DataFrame at a given frequency with pandas

Time: 2017-06-14 00:25:29

Tags: python pandas dataframe time-series grouping

I have a DataFrame in which a value is measured every 30 minutes, like this:

2015-01-01 00:00:00   94.50
2015-01-01 00:30:00   78.75
2015-01-01 01:00:00   85.87
2015-01-01 01:30:00   85.88
2015-01-01 02:00:00   84.75
2015-01-01 02:30:00   87.50

So there are 48 values per day. The first column is a time index created with:

date = pd.date_range('1/1/2015', periods=len(series), freq='30min')
series = series.values.reshape(-1, 1)
df = pd.DataFrame(series, index=date)

What I want is the mean for each time of day on each weekday, with the weekdays numbered 1-7.

My first idea was to group by weekday and by frequency (30 minutes), like this:

df = df.groupby([df.index.weekday, df.index.freq])
print(df.describe())


       count    mean std     min     25%     50%     75%   
0 2015-01-05 00:30:00   1.0   93.75 NaN   93.75   93.75   93.75   93.75   
  2015-01-05 01:00:00   1.0  110.25 NaN  110.25  110.25  110.25  110.25   
  2015-01-05 01:30:00   1.0  110.88 NaN  110.88  110.88  110.88  110.88   
  2015-01-05 02:00:00   1.0   90.12 NaN   90.12   90.12   90.12   90.12   
  2015-01-05 02:30:00   1.0   91.50 NaN   91.50   91.50   91.50   91.50   
  2015-01-05 03:00:00   1.0   94.13 NaN   94.13   94.13   94.13   94.13   
  2015-01-05 03:30:00   1.0   90.62 NaN   90.62   90.62   90.62   90.62   
  2015-01-05 04:00:00   1.0   91.88 NaN   91.88   91.88   91.88   91.88   
  2015-01-05 04:30:00   1.0   92.50 NaN   92.50   92.50   92.50   92.50   
  2015-01-05 05:00:00   1.0   98.12 NaN   98.12   98.12   98.12   98.12   
  2015-01-05 05:30:00   1.0  105.75 NaN  105.75  105.75  105.75  105.75   
  2015-01-05 06:00:00   1.0  100.50 NaN  100.50  100.50  100.50  100.50   
  2015-01-05 06:30:00   1.0   82.25 NaN   82.25   82.25   82.25   82.25   
  2015-01-05 07:00:00   1.0   81.75 NaN   81.75   81.75   81.75   81.75   
  2015-01-05 07:30:00   1.0   90.50 NaN   90.50   90.50   90.50   90.50   
  2015-01-05 08:00:00   1.0   89.50 NaN   89.50   89.50   89.50   89.50   
  2015-01-05 08:30:00   1.0   89.63 NaN   89.63   89.63   89.63   89.63   
  2015-01-05 09:00:00   1.0   84.62 NaN   84.62   84.62   84.62   84.62   
  2015-01-05 09:30:00   1.0   86.63 NaN   86.63   86.63   86.63   86.63   
  2015-01-05 10:00:00   1.0   96.12 NaN   96.12   96.12   96.12   96.12   
  2015-01-05 10:30:00   1.0  104.13 NaN  104.13  104.13  104.13  104.13   
  2015-01-05 11:00:00   1.0  101.12 NaN  101.12  101.12  101.12  101.12   
  2015-01-05 11:30:00   1.0   85.88 NaN   85.88   85.88   85.88   85.88   
  2015-01-05 12:00:00   1.0   77.12 NaN   77.12   77.12   77.12   77.12   
  2015-01-05 12:30:00   1.0   78.88 NaN   78.88   78.88   78.88   78.88   
  2015-01-05 13:00:00   1.0   76.62 NaN   76.62   76.62   76.62   76.62   
  2015-01-05 13:30:00   1.0   78.63 NaN   78.63   78.63   78.63   78.63   
  2015-01-05 14:00:00   1.0   85.37 NaN   85.37   85.37   85.37   85.37   
  2015-01-05 14:30:00   1.0  103.63 NaN  103.63  103.63  103.63  103.63   
  2015-01-05 15:00:00   1.0  112.87 NaN  112.87  112.87  112.87  112.87   
...                     ...     ...  ..     ...     ...     ...     ...   
6 2016-10-02 09:30:00   1.0   84.75 NaN   84.75   84.75   84.75   84.75   
  2016-10-02 10:00:00   1.0   60.49 NaN   60.49   60.49   60.49   60.49   
  2016-10-02 10:30:00   1.0   76.25 NaN   76.25   76.25   76.25   76.25   
  2016-10-02 11:00:00   1.0   68.13 NaN   68.13   68.13   68.13   68.13   
  2016-10-02 11:30:00   1.0   54.15 NaN   54.15   54.15   54.15   54.15   
  2016-10-02 12:00:00   1.0   79.91 NaN   79.91   79.91   79.91   79.91   
  2016-10-02 12:30:00   1.0   72.79 NaN   72.79   72.79   72.79   72.79   
  2016-10-02 13:00:00   1.0   77.49 NaN   77.49   77.49   77.49   77.49   
  2016-10-02 13:30:00   1.0   77.65 NaN   77.65   77.65   77.65   77.65   
  2016-10-02 14:00:00   1.0   70.44 NaN   70.44   70.44   70.44   70.44   
  2016-10-02 14:30:00   1.0   82.47 NaN   82.47   82.47   82.47   82.47   
  2016-10-02 15:00:00   1.0   41.53 NaN   41.53   41.53   41.53   41.53   
  2016-10-02 15:30:00   1.0   66.65 NaN   66.65   66.65   66.65   66.65   
  2016-10-02 16:00:00   1.0   55.23 NaN   55.23   55.23   55.23   55.23   
  2016-10-02 16:30:00   1.0   59.45 NaN   59.45   59.45   59.45   59.45   
  2016-10-02 17:00:00   1.0   79.92 NaN   79.92   79.92   79.92   79.92   
  2016-10-02 17:30:00   1.0   58.48 NaN   58.48   58.48   58.48   58.48   
  2016-10-02 18:00:00   1.0   92.56 NaN   92.56   92.56   92.56   92.56   
  2016-10-02 18:30:00   1.0   86.92 NaN   86.92   86.92   86.92   86.92   
  2016-10-02 19:00:00   1.0   88.61 NaN   88.61   88.61   88.61   88.61   
  2016-10-02 19:30:00   1.0   99.21 NaN   99.21   99.21   99.21   99.21   
  2016-10-02 20:00:00   1.0   81.02 NaN   81.02   81.02   81.02   81.02   
  2016-10-02 20:30:00   1.0   84.83 NaN   84.83   84.83   84.83   84.83   
  2016-10-02 21:00:00   1.0   59.29 NaN   59.29   59.29   59.29   59.29   
  2016-10-02 21:30:00   1.0   95.99 NaN   95.99   95.99   95.99   95.99   
  2016-10-02 22:00:00   1.0   76.95 NaN   76.95   76.95   76.95   76.95   
  2016-10-02 22:30:00   1.0  112.49 NaN  112.49  112.49  112.49  112.49   
  2016-10-02 23:00:00   1.0   88.85 NaN   88.85   88.85   88.85   88.85   
  2016-10-02 23:30:00   1.0  122.40 NaN  122.40  122.40  122.40  122.40   
  2016-10-03 00:00:00   1.0   82.84 NaN   82.84   82.84   82.84   82.84 

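Looking at this output, you can see that the rows are only grouped by weekday: every timestamp ends up in its own group with a count of 1. So this is not the right way to group if I want to compute the means described above.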

1 answer:

Answer 0: (score: 3)

I would use df.index.weekday and df.index.time:

df.groupby([df.index.weekday, df.index.time]).mean()
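
The reason this works where the original attempt did not is that df.index.freq is a single offset describing the spacing of the whole index, not a per-row label, whereas df.index.time gives one time-of-day value per row. A minimal sketch of the full round trip follows. The random data, the column name "value", and the final renumbering to 1-7 are assumptions made for illustration and are not part of the original question's data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the asker's data (assumption): four full weeks
# of half-hourly values, i.e. 48 rows per day.
rng = np.random.default_rng(0)
index = pd.date_range("2015-01-01", periods=48 * 28, freq="30min")
df = pd.DataFrame({"value": rng.uniform(50, 120, size=len(index))}, index=index)

# One mean per (weekday, time-of-day) pair.
# pandas numbers weekdays 0 (Monday) through 6 (Sunday).
means = df.groupby([df.index.weekday, df.index.time])["value"].mean()
means.index.names = ["weekday", "time"]

# Pivot so that each row is a time of day and each column is a weekday.
table = means.unstack("weekday")

# Renumber the columns to 1-7 to match the numbering used in the question.
table.columns = table.columns + 1

print(table.head())
```

With four weeks of data, each cell of the table is the mean of four measurements. The result has 48 rows (one per half-hour slot) and 7 columns (one per weekday).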