I have a DataFrame that looks like this:
Grouped     Week         Revenue  Users  Period      CSum
2013-10-14  2013-10-14   2863.75     36       1   2863.75
            2013-10-21    202.20      4       2   3065.95
            2013-10-28    603.45      8       3   3669.40
            2013-11-04    535.65      9       4   4205.05
            2013-11-11    424.45     14       5   4629.50
2015-06-01  2015-06-01  24115.91    468       1  24115.91
            2015-06-08   1634.93     32       2  25750.84
            2015-06-15   2664.00     62       3  28414.84
            2015-06-22   1646.05     40       4  30060.89
I'm trying to figure out how to get pandas to return only the 4th period for each Grouped value, so that I end up with:
Grouped     Week         Revenue  Users  Period      CSum
2013-10-14  2013-11-04    535.65      9       4   4205.05
2015-06-01  2015-06-22   1646.05     40       4  30060.89
What is the best way to do this?
Answer 0 (score: 1)
A solution using boolean indexing:
df1 = df[df['Period'] == 4]
print(df1)
                         Revenue  Users  Period      CSum
Grouped     Week
2013-10-14  2013-11-04    535.65      9       4   4205.05
2015-06-01  2015-06-22   1646.05     40       4  30060.89
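For a self-contained version of the boolean-indexing approach, here is a minimal sketch that rebuilds the question's data by hand. How the frame was originally built is an assumption (the post only shows its printed form): the list of tuples, the set_index call, and keeping the dates as plain strings are all illustrative choices.

```python
import pandas as pd

# Values copied from the question; the construction itself is an assumption.
rows = [
    ("2013-10-14", "2013-10-14", 2863.75, 36, 1, 2863.75),
    ("2013-10-14", "2013-10-21", 202.20, 4, 2, 3065.95),
    ("2013-10-14", "2013-10-28", 603.45, 8, 3, 3669.40),
    ("2013-10-14", "2013-11-04", 535.65, 9, 4, 4205.05),
    ("2013-10-14", "2013-11-11", 424.45, 14, 5, 4629.50),
    ("2015-06-01", "2015-06-01", 24115.91, 468, 1, 24115.91),
    ("2015-06-01", "2015-06-08", 1634.93, 32, 2, 25750.84),
    ("2015-06-01", "2015-06-15", 2664.00, 62, 3, 28414.84),
    ("2015-06-01", "2015-06-22", 1646.05, 40, 4, 30060.89),
]
columns = ["Grouped", "Week", "Revenue", "Users", "Period", "CSum"]

# Move Grouped and Week into a two-level MultiIndex, as in the printed frame.
df = pd.DataFrame(rows, columns=columns).set_index(["Grouped", "Week"])

# The comparison yields a boolean Series; indexing with it keeps the True rows.
fourth = df[df["Period"] == 4]
print(fourth)
```

Assigning the result to a new name (fourth here) leaves the full df available for further work.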
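Another solution uses cumcount, for when you need to select the 4th row within each first-level group of the MultiIndex and the first solution cannot be used. cumcount numbers the rows of each group starting from 0, so the 4th row has count 3: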
ser = df.groupby(level=0).cumcount()
print(ser)
Grouped     Week
2013-10-14  2013-10-14    0
            2013-10-21    1
            2013-10-28    2
            2013-11-04    3
            2013-11-11    4
2015-06-01  2015-06-01    0
            2015-06-08    1
            2015-06-15    2
            2015-06-22    3
dtype: int64
print(ser == 3)
Grouped     Week
2013-10-14  2013-10-14    False
            2013-10-21    False
            2013-10-28    False
            2013-11-04     True
            2013-11-11    False
2015-06-01  2015-06-01    False
            2015-06-08    False
            2015-06-15    False
            2015-06-22     True
dtype: bool
print(df[ser == 3])
                         Revenue  Users  Period      CSum
Grouped     Week
2013-10-14  2013-11-04    535.65      9       4   4205.05
2015-06-01  2015-06-22   1646.05     40       4  30060.89
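To show the cumcount approach in a case where filtering on a column is not an option, the sketch below rebuilds the question's data without the Period column and selects rows purely by position within each group. Dropping Period and building the frame from a list of tuples are assumptions made for illustration; the post does not say how the frame was created.

```python
import pandas as pd

# Values copied from the question, with the Period column deliberately left out.
rows = [
    ("2013-10-14", "2013-10-14", 2863.75, 36, 2863.75),
    ("2013-10-14", "2013-10-21", 202.20, 4, 3065.95),
    ("2013-10-14", "2013-10-28", 603.45, 8, 3669.40),
    ("2013-10-14", "2013-11-04", 535.65, 9, 4205.05),
    ("2013-10-14", "2013-11-11", 424.45, 14, 4629.50),
    ("2015-06-01", "2015-06-01", 24115.91, 468, 24115.91),
    ("2015-06-01", "2015-06-08", 1634.93, 32, 25750.84),
    ("2015-06-01", "2015-06-15", 2664.00, 62, 28414.84),
    ("2015-06-01", "2015-06-22", 1646.05, 40, 30060.89),
]
columns = ["Grouped", "Week", "Revenue", "Users", "CSum"]
df = pd.DataFrame(rows, columns=columns).set_index(["Grouped", "Week"])

# Number the rows within each first-level group, starting at 0.
pos = df.groupby(level=0).cumcount()

# Position 3 is the 4th row of each group, in the frame's current row order.
fourth = df[pos == 3]
print(fourth)
```

Because cumcount follows the existing row order, the frame should already be sorted by Week within each group. A group with fewer than four rows never reaches position 3 and so contributes no row to the result.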