I have the following DataFrame:
Date group File1 File2 Begin Date End Date
4/28/2014 A CC2015H CC2015K 5/1/2014 2/2/2015
4/29/2014 A CC2015H CC2015K 5/1/2014 2/2/2015
4/30/2014 A CC2015H CC2015K 5/1/2014 2/2/2015
5/1/2014 A CC2015H CC2015K 5/1/2014 2/2/2015
5/2/2014 A CC2015H CC2015K 5/1/2014 2/2/2015
1/22/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/23/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/26/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/27/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/28/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/29/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/30/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
2/2/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
2/3/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
2/4/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
2/5/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
2/6/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
8/25/2014 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/26/2014 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/27/2014 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/28/2014 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/29/2014 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
9/2/2014 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/7/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/10/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/11/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/12/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/13/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/14/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/17/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/18/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/19/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/20/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
It is actually a much larger DataFrame with more groups; I shortened it for display. I am trying to filter the DataFrame on the Date column as follows:
df = df.loc[df.groupby(['group','File1', 'File2']).df['Date'] >= df.groupby(['group', 'File1', 'File2'])['Begin Date']
The output should look like this:
Date group File1 File2 Begin Date End Date
5/1/2014 A CC2015H CC2015K 5/1/2014 2/2/2015
5/2/2014 A CC2015H CC2015K 5/1/2014 2/2/2015
1/22/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/23/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/26/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/27/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/28/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/29/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
1/30/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
2/2/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
2/3/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
2/4/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
2/5/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
2/6/2015 A CC2015H CC2015K 5/1/2014 2/2/2015
8/29/2014 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
9/2/2014 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/7/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/10/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/11/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/12/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/13/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/14/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/17/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/18/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/19/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
8/20/2015 B ZC2015U ZC2015Z 8/29/2014 8/14/2015
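Bonus question: I would also like to filter on both the begin date and the end date, i.e. keep only the rows in each group that satisfy the condition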
(df['Date'] >= df['Begin Date']) & (df['Date'] <= df['End Date'])
Thanks for any help or suggestions.
Answer 0 (score: 0)
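I don't think groupby is needed here, because you are not aggregating anything within each group (min, max, sum, count, etc.). Begin Date and End Date are already repeated on every row, so a plain row-wise comparison does the job. between, which includes both endpoints by default, is what you are looking for: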
df[df['Date'].between(df['Begin Date'], df['End Date'])]
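Note that the one-liner above only compares dates correctly if the three columns are real datetimes rather than strings. Here is a minimal sketch, assuming the columns arrive as month/day/year strings as in the question. The seven sample rows and the names after_begin and in_window are made up for illustration. It covers both the main question (on or after Begin Date) and the bonus question (between Begin Date and End Date):

```python
import pandas as pd

# A few rows taken from the question's data; File1/File2 are omitted for brevity.
df = pd.DataFrame({
    "Date": ["4/30/2014", "5/1/2014", "2/2/2015", "2/3/2015",
             "8/28/2014", "8/29/2014", "8/17/2015"],
    "group": ["A", "A", "A", "A", "B", "B", "B"],
    "Begin Date": ["5/1/2014"] * 4 + ["8/29/2014"] * 3,
    "End Date": ["2/2/2015"] * 4 + ["8/14/2015"] * 3,
})

# Assumption: the columns are strings, so convert them to datetimes first.
# Comparing the raw strings would order them alphabetically, not by date.
for col in ["Date", "Begin Date", "End Date"]:
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y")

# Main question: keep rows whose Date is on or after that row's Begin Date.
after_begin = df[df["Date"] >= df["Begin Date"]]

# Bonus question: keep rows whose Date lies within [Begin Date, End Date].
in_window = df[df["Date"].between(df["Begin Date"], df["End Date"])]

print(after_begin)
print(in_window)
```

If you write the bonus condition by hand instead of using between, wrap each comparison in parentheses, because & binds more tightly than >= and <=.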