答案 0 :(得分:0)
一种选择是为连续四个月的每组创建一个具有唯一值的新列。 然后对此列和值进行排序,以便后一个降序 然后只连续4个月在具有唯一值的列上进行分组 然后只取该组的第一个元素
from pandas import DataFrame
from datetime import datetime,date
#create a dummy dataframe
values = []
for mo in range(1,13):
row = {}
row["date"]=date(2018,mo,1)
row["val"]=5+0.1*mo
values.append(row)
df = DataFrame(values)
#create a column which can be ordered and is unique for
#every group of successive 4 month
#if more than one year is to be considered the lambda function
#should cover this as well
df["quadmester"]=df["date"].apply(lambda x: (x.month-1)/4)
#sort by the added column and then by value
#since the date for which top value in 4 consecutive month
#is considered sort descending on values
df = df.sort_values(by=["quadmester","val"],ascending=[1,0])
#show the dataframe
print df
产生
date val quadmester
3 2018-04-01 5.4 0
2 2018-03-01 5.3 0
1 2018-02-01 5.2 0
0 2018-01-01 5.1 0
7 2018-08-01 5.8 1 6 2018-07-01 5.7 1
5 2018-06-01 5.6 1
4 2018-05-01 5.5 1
11 2018-12-01 6.2 2
10 2018-11-01 6.1 2 9 2018-10-01 6.0 2
8 2018-09-01 5.9 2
| 0 | 3 | 2018-04-01 | 5.4 | 0 |
| 1 | 7 | 2018-08-01 | 5.8 | 1 |
| 2 | 11 | 2018-12-01 | 6.2 | 2 |
答案 1 :(得分:0)
您可以按日期应用4个月内的 groupby 模和 agg 以及 max 和 idxmax 来获得相应的最大值及其连续四个月的日期。
这是一个假设您的示例中日期为索引的单行解决方案:
df.groupby(lambda x:(x.year*10 + x.month%4)).agg(['max','idxmax'])
答案 2 :(得分:0)
import numpy as np
from numpy.random import randn as rn
DatetimeIndex = pd.date_range('1/1/2015', '12/31/2015', freq='B')
np.random.seed(101)
df = pd.DataFrame(rn(len(DatetimeIndex)),index=DatetimeIndex)
df.groupby(pd.Grouper(freq='4MS',label='right')).max()[:3]
2015-05-01 2.706850 2015-09-01 2.302987 2016-01-01 2.493990