Question

Suppose I have the following two DataFrames:

df = pd.DataFrame({'c': ['abc', 'def', 'wyx', 'abc', 'wyx'], 'begin_date': ['2020-01-01', '2000-12-23', '2003-07-07', '2005-03-02', '2009-02-01'], 'end_date': ['2020-01-31', '2001-02-02', '2004-03-02', '2005-04-01', '2010-07-04']})

c  begin_date    end_date
abc  2020-01-01  2020-01-31
def  2000-12-23  2001-02-02
wyx  2003-07-07  2004-03-02
abc  2005-03-02  2005-04-01
wyx  2009-02-01  2010-07-04

df1 = pd.DataFrame({'id': np.arange(5), 'c': ['wyx', 'abc', 'abc', 'def', 'qwe'], 'date': ['2003-12-12', '2020-02-02', '2005-03-15', '2002-11-05', '2005-01-01']})

id    c        date
0  wyx  2003-12-12
1  abc  2020-02-02
2  abc  2005-03-15
3  def  2002-11-05
4  qwe  2005-01-01

I want to find out which ids in df1 have a date that falls within one of the ranges in df for the same c.

My final DataFrame would look like this:

id    c        date  in_range
0  wyx  2003-12-12      True
1  abc  2020-02-02     False
2  abc  2005-03-15      True
3  def  2002-11-05     False

I managed to do it with the following code:

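# Pair every df1 row with all df ranges that share the same c
x = df1.merge(df, on='c')
# ISO-formatted date strings compare correctly as plain strings
x['in_range'] = x['date'].ge(x['begin_date']) & x['date'].le(x['end_date'])
# any() gives True/False per group; sum() would return a match count instead
x.groupby(['id', 'c', 'date'])['in_range'].any().reset_index()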

But I think there may be a simpler way, and df1 actually has many more columns, which would make the groupby very large.
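One possible way to keep the groupby small, as a minimal sketch rather than a definitive answer: merge only the key columns, reduce to one flag per id, and map that flag back onto df1 so the extra columns never enter the groupby. This assumes id is unique in df1; the names pairs, flag, and result are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'c': ['abc', 'def', 'wyx', 'abc', 'wyx'],
    'begin_date': ['2020-01-01', '2000-12-23', '2003-07-07', '2005-03-02', '2009-02-01'],
    'end_date': ['2020-01-31', '2001-02-02', '2004-03-02', '2005-04-01', '2010-07-04'],
})
df1 = pd.DataFrame({
    'id': np.arange(5),
    'c': ['wyx', 'abc', 'abc', 'def', 'qwe'],
    'date': ['2003-12-12', '2020-02-02', '2005-03-15', '2002-11-05', '2005-01-01'],
})

# Parse the strings so the comparison does not rely on the string format
for col in ['begin_date', 'end_date']:
    df[col] = pd.to_datetime(df[col])
df1['date'] = pd.to_datetime(df1['date'])

# Merge only the columns needed for the check (assumption: id is unique in df1)
pairs = df1[['id', 'c', 'date']].merge(df, on='c')
pairs['in_range'] = pairs['date'].between(pairs['begin_date'], pairs['end_date'])

# One boolean per id: True if the date falls in any range for that c
flag = pairs.groupby('id')['in_range'].any()

# Keep only ids whose c exists in df (same effect as the inner merge),
# then attach the flag without touching the other df1 columns
result = df1[df1['id'].isin(flag.index)].copy()
result['in_range'] = result['id'].map(flag)
print(result)
```

Grouping by id alone means any additional columns in df1 are carried along by the final map step instead of being listed as groupby keys.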

Check whether a date falls within a range in a pandas merge

0 answers: