Get the monthly maximum of a daily DataFrame together with the corresponding index value

Time: 2016-03-02 17:54:02

Tags: python pandas group-by max time-series

I downloaded daily data from Yahoo Finance:

                    Open          High           Low         Close     Volume  \
Date                                                                            
2016-01-04  10485.809570  10485.910156  10248.580078  10283.440430  116249000   
2016-01-05  10373.269531  10384.259766  10173.519531  10310.099609   82348000   
2016-01-06  10288.679688  10288.679688  10094.179688  10214.019531   87751700   
2016-01-07  10144.169922  10145.469727   9810.469727   9979.849609  124188100   
2016-01-08  10010.469727  10122.459961   9849.339844   9849.339844   95672200   
...
2016-02-23   9503.120117   9535.120117   9405.219727   9416.769531   87240700   
2016-02-24   9396.480469   9415.330078   9125.190430   9167.799805   99216000   
2016-02-25   9277.019531   9391.309570   9199.089844   9331.480469          0   
2016-02-26   9454.519531   9576.879883   9436.330078   9513.299805   95662100   
2016-02-29   9424.929688   9498.570312   9332.419922   9495.400391   90978700   

I want to find the highest close for each month, together with the date on which that close occurred.

Using groupby, dfM = df['Close'].groupby(df.index.month).max() returns the monthly maximum, but I lose the daily index position.

   grouped by month 
1      10310.099609
2       9757.879883

Is there a good way to keep the index?

I am looking for a result like this:

            grouped by month 
2016-01-05      10310.099609
2016-02-01       9757.879883

1 answer:

Answer 0 (score: 8)

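You can use groupby with a time-based grouper to get the monthly maximum. This was written as pd.TimeGrouper('M') at the time; later pandas versions deprecated it in favor of pd.Grouper(freq='M'):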

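import pandas as pd
# pandas.io.data was later moved into the separate pandas-datareader package
from pandas_datareader.data import DataReader

aapl = DataReader('AAPL', data_source='yahoo', start='2015-6-1')
# pd.Grouper(freq='M') is the replacement for the older pd.TimeGrouper('M')
>>> aapl.groupby(pd.Grouper(freq='M')).Close.max()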
Date
2015-06-30    130.539993
2015-07-31    132.070007
2015-08-31    119.720001
2015-09-30    116.410004
2015-10-31    120.529999
2015-11-30    122.570000
2015-12-31    119.029999
2016-01-31    105.349998
2016-02-29     98.120003
2016-03-31    100.529999
Freq: M, Name: Close, dtype: float64

Using idxmax instead returns the date on which each month's highest price occurred.

>>> aapl.groupby(pd.Grouper(freq='M')).Close.idxmax()
Date
2015-06-30   2015-06-01
2015-07-31   2015-07-20
2015-08-31   2015-08-10
2015-09-30   2015-09-16
2015-10-31   2015-10-29
2015-11-30   2015-11-03
2015-12-31   2015-12-04
2016-01-31   2016-01-04
2016-02-29   2016-02-17
2016-03-31   2016-03-01
Name: Close, dtype: datetime64[ns]
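The two calls above can be tried without a network download. The following is only a minimal sketch: it uses six of the Close values shown in the question, and it groups by `index.to_period("M")` rather than `TimeGrouper`. That swap is an assumption made here so the snippet does not depend on a particular frequency alias; it is not what the answer itself uses. The variable names are illustrative.

```python
import pandas as pd

# Close prices copied from six of the sample rows shown in the question.
close = pd.Series(
    [10283.440430, 10310.099609, 10214.019531,
     9416.769531, 9513.299805, 9495.400391],
    index=pd.to_datetime(
        ["2016-01-04", "2016-01-05", "2016-01-06",
         "2016-02-23", "2016-02-26", "2016-02-29"]
    ),
    name="Close",
)

# Group by calendar month as a Period
# (assumption: used here in place of TimeGrouper/Grouper).
by_month = close.groupby(close.index.to_period("M"))

monthly_max = by_month.max()          # highest close in each month
monthly_max_date = by_month.idxmax()  # date on which that high occurred

print(monthly_max)
print(monthly_max_date)
```

Because only a few rows are included, the February values here differ from the full-data result quoted in the question.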

To get both results side by side:

>>> aapl.groupby(pd.Grouper(freq='M')).Close.agg(**{'max price': 'max', 'max date': 'idxmax'})
             max price   max date
Date                             
2015-06-30  130.539993 2015-06-01
2015-07-31  132.070007 2015-07-20
2015-08-31  119.720001 2015-08-10
2015-09-30  116.410004 2015-09-16
2015-10-31  120.529999 2015-10-29
2015-11-30  122.570000 2015-11-03
2015-12-31  119.029999 2015-12-04
2016-01-31  105.349998 2016-01-04
2016-02-29   98.120003 2016-02-17
2016-03-31  100.529999 2016-03-01
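The question asked for the maxima indexed by the actual trading date rather than by month end. One possible way to get that shape is to feed the `idxmax` result back into `.loc`. This is a sketch under assumptions, not part of the original answer: the frame `df` is built from a few rows of the question's sample, and `index.to_period("M")` again stands in for the time grouper.

```python
import pandas as pd

# A few rows of the question's data; only the Close column is needed here.
df = pd.DataFrame(
    {"Close": [10283.440430, 10310.099609, 9979.849609,
               9167.799805, 9513.299805, 9495.400391]},
    index=pd.to_datetime(
        ["2016-01-04", "2016-01-05", "2016-01-07",
         "2016-02-24", "2016-02-26", "2016-02-29"]
    ),
)
df.index.name = "Date"

# Date of the highest close within each calendar month.
max_dates = df.groupby(df.index.to_period("M"))["Close"].idxmax()

# Select those rows from the original frame, so the daily dates stay as the index.
monthly_high = df.loc[max_dates, "Close"]

print(monthly_high)
```

Selecting with `.loc` keeps the original DatetimeIndex, so other columns such as Volume could be returned for the same rows by dropping the column selector.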