I have a DataFrame:
index UoW Category Description Date Channel Trans
ADATE
2018-12-31 1603 Pay Infringement 31/12/2018 AustPost 209
2018-12-31 1604 Pay Infringement 31/12/2018 AustPost 14
2019-12-31 1605 Pay Infringement 31/12/2018 CSC 234
2019-12-31 1606 Pay Infringement 31/12/2018 CSC 1
2019-12-31 1607 Pay Infringement 31/12/2018 DTMR Other 1
2018-12-31 1608 Pay Infringement 31/12/2018 Internet 496
2018-12-30 1609 Pay Infringement 30/12/2018 CSC 266
I want to add a column df['MonthofYear'] after grouping by Channel and year.
The following gives me the result I need, but without the additional column:
df['Trans'].groupby([df['Channel'], df.index.year]).agg(['max', 'min'])
I tried:
df['MonthofYear']=df['Trans'].groupby([df['Channel'], df.index.year]).agg(['max', 'min']).transform(df.index.month)
Any help would be appreciated.
Answer 0 (score: 2)
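Use DataFrameGroupBy.idxmax and DataFrameGroupBy.idxmin to get the datetimes from the index at which the Trans column reaches its maximum and minimum in each group, then convert those values to months: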
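# (output column name, aggregation function) pairs
tup = [('MaxVal', 'max'),
       ('MinVal', 'min'),
       ('MonthofYearMin', 'idxmin'),
       ('MonthofYearMax', 'idxmax')]
# group by Channel and by the year taken from the DatetimeIndex
df1 = df.groupby(['Channel', df.index.year.rename('year')])['Trans'].agg(tup)
# idxmin/idxmax return index labels (datetimes); keep only the month
df1['MonthofYearMax'] = df1['MonthofYearMax'].dt.month
df1['MonthofYearMin'] = df1['MonthofYearMin'].dt.month
print(df1)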
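For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The sample rows are made up for illustration (they are not the data from the question), and the final join that copies the per-group values back onto the original rows is my assumption about what "adding the column to df" should mean, not something stated in the question.

```python
import pandas as pd

# Illustrative toy data (hypothetical, not the question's rows):
# a DatetimeIndex named ADATE plus Channel and Trans columns.
df = pd.DataFrame(
    {
        "Channel": ["AustPost", "AustPost", "AustPost", "CSC", "CSC", "CSC"],
        "Trans": [209, 14, 120, 234, 1, 266],
    },
    index=pd.DatetimeIndex(
        ["2018-03-15", "2018-07-02", "2018-12-31",
         "2019-01-10", "2019-05-20", "2019-11-30"],
        name="ADATE",
    ),
)

# (output column name, aggregation function) pairs, as in the answer
tup = [
    ("MaxVal", "max"),
    ("MinVal", "min"),
    ("MonthofYearMin", "idxmin"),
    ("MonthofYearMax", "idxmax"),
]

# Group by Channel and by the year of the index
year = df.index.year.rename("year")
df1 = df.groupby(["Channel", year])["Trans"].agg(tup)

# idxmin/idxmax give the index label (a timestamp) of the extreme value;
# reduce each timestamp to its month number
df1["MonthofYearMax"] = df1["MonthofYearMax"].dt.month
df1["MonthofYearMin"] = df1["MonthofYearMin"].dt.month
print(df1)

# Optional (assumed goal): attach the per-group values to every original row
out = df.assign(year=df.index.year).join(df1, on=["Channel", "year"])
print(out)
```

Note that if several rows in a group share the same index label, idxmax and idxmin still return that label, so the month is unaffected by duplicate dates.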