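My input looks like the following. I want to create a new column, quarter, whose values repeat within each "name" group, as shown in the expected output.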
number name date1 date2
1750 AAR CORP 12/18/2015 5/31/2012
1750 AAR CORP 3/23/2016 5/31/2012
1750 AAR CORP 9/23/2016 5/31/2012
1750 AAR CORP 12/22/2016 5/31/2012
1800 ABBOTT LAB 5/8/2012 12/31/2011
1800 ABBOTT LAB 8/7/2012 12/31/2011
1800 ABBOTT LAB 11/7/2012 12/31/2011
1800 ABBOTT LAB 5/8/2013 12/31/2011
1800 ABBOTT LAB 8/6/2013 12/31/2011
Expected output:
number name date1 date2 quarter
1750 AAR CORP 12/18/2015 5/31/2012 QTR 1
1750 AAR CORP 3/23/2016 5/31/2012 QTR 2
1750 AAR CORP 9/23/2016 5/31/2012 QTR 3
1750 AAR CORP 12/22/2016 5/31/2012 QTR 1
1800 ABBOTT LAB 5/8/2012 12/31/2011 QTR 1
1800 ABBOTT LAB 8/7/2012 12/31/2011 QTR 2
1800 ABBOTT LAB 11/7/2012 12/31/2011 QTR 3
1800 ABBOTT LAB 5/8/2013 12/31/2011 QTR 1
1800 ABBOTT LAB 8/6/2013 12/31/2011 QTR 2
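The quarter values should repeat in a cycle of 3 (QTR 1, QTR 2, QTR 3, QTR 1, ...) for as long as that name has rows.
I am stuck after a simple groupby in pandas and don't know how to continue within each group.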
Answer 0 (score: 3)
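You can use cumcount on the groups, which numbers the rows within each group starting from 0. To get the repeating 1, 2, 3, ... sequence, take that count modulo 3 and add 1.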
In [125]: 'QTR ' + ((df.groupby('name').cumcount() % 3) + 1).astype(str)
Out[125]:
0 QTR 1
1 QTR 2
2 QTR 3
3 QTR 1
4 QTR 1
5 QTR 2
6 QTR 3
7 QTR 1
8 QTR 2
dtype: object
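To see why this works, it may help to look at the intermediate values. The following is a minimal sketch on a made-up frame (the names "A" and "B" and the row counts are illustrative, not the asker's data); the variables pos, cycle and labels are introduced here only for explanation.

```python
import pandas as pd

# Hypothetical toy data: group "A" has 4 rows, group "B" has 5 rows.
df = pd.DataFrame({"name": ["A"] * 4 + ["B"] * 5})

# cumcount numbers the rows within each group, starting from 0.
pos = df.groupby("name").cumcount()

# Modulo 3 wraps the position into 0, 1, 2; adding 1 shifts it to 1, 2, 3.
cycle = pos % 3 + 1

# Convert to strings and prepend the label prefix.
labels = "QTR " + cycle.astype(str)

# Show the steps side by side.
steps = pd.DataFrame({"name": df["name"], "pos": pos, "cycle": cycle, "label": labels})
print(steps)
```

The counter restarts at 0 whenever the group changes, which is why the first row of every name gets QTR 1.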
Alternatively, using the method forms mod and add:
In [142]: 'QTR ' + df.groupby('name').cumcount().mod(3).add(1).astype(str)
Out[142]:
0 QTR 1
1 QTR 2
2 QTR 3
3 QTR 1
4 QTR 1
5 QTR 2
6 QTR 3
7 QTR 1
8 QTR 2
dtype: object
Details
In [131]: df['quarter'] = 'QTR ' + ((df.groupby('name').cumcount() % 3) + 1).astype(str)
In [132]: df
Out[132]:
number name date1 date2 quarter
0 1750 AAR CORP 12/18/2015 5/31/2012 QTR 1
1 1750 AAR CORP 3/23/2016 5/31/2012 QTR 2
2 1750 AAR CORP 9/23/2016 5/31/2012 QTR 3
3 1750 AAR CORP 12/22/2016 5/31/2012 QTR 1
4 1800 ABBOTT LAB 5/8/2012 12/31/2011 QTR 1
5 1800 ABBOTT LAB 8/7/2012 12/31/2011 QTR 2
6 1800 ABBOTT LAB 11/7/2012 12/31/2011 QTR 3
7 1800 ABBOTT LAB 5/8/2013 12/31/2011 QTR 1
8 1800 ABBOTT LAB 8/6/2013 12/31/2011 QTR 2
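For anyone who wants to run this outside an IPython session, here is a self-contained sketch that rebuilds the frame from the question and assigns the column. It assumes the dates are kept as plain strings and that the second date column is named date2, as in the output above.

```python
import pandas as pd

# Rebuild the sample data from the question (dates kept as strings).
df = pd.DataFrame({
    "number": [1750] * 4 + [1800] * 5,
    "name": ["AAR CORP"] * 4 + ["ABBOTT LAB"] * 5,
    "date1": [
        "12/18/2015", "3/23/2016", "9/23/2016", "12/22/2016",
        "5/8/2012", "8/7/2012", "11/7/2012", "5/8/2013", "8/6/2013",
    ],
    "date2": ["5/31/2012"] * 4 + ["12/31/2011"] * 5,
})

# Position within each name group -> cycle 1, 2, 3 -> "QTR n" label.
df["quarter"] = "QTR " + df.groupby("name").cumcount().mod(3).add(1).astype(str)

print(df)
```

Note that cumcount follows the current row order, so this assumes the rows of each name are already in the order in which the cycle should run.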