Consider two date columns (in pandas DataFrames) in YYYY-MM-DD format:
import pandas as pd
df1 = pd.DataFrame(data={'col1': ['2017-10-06', '2017-11-15', '2017-11-05', '2018-10-06']})
df2 = pd.DataFrame(data={'col1': ['2017-10-06', '2017-10-06', '2018-12-05', '2017-10-17', '2019-10-06', '2017-12-05', '2017-3-30']})
df1:
col1
0 2017-10-06
1 2017-11-15
2 2017-11-05
3 2018-10-06
Name: col1, dtype: object
df2:
col1
0 2017-10-06
1 2017-10-06
2 2018-12-05
3 2017-10-17
4 2019-10-06
5 2017-12-05
6 2017-3-30
Note: the two columns have different lengths.
For each date in df1, I need to count how many dates in df2's col1 are later than it.
Expected output:
df1
col1 count upcoming in df2 col1
0 2017-10-06 4
1 2017-11-15 3
2 2017-11-05 3
3 2018-10-06 2
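Answer 0 (score: 1)
Here is one way using NumPy broadcasting. Parse the strings as datetimes first; otherwise the non-zero-padded '2017-3-30' is compared as text and sorts after '2017-10-06':
df1['count'] = (pd.to_datetime(df1.col1).values[:, None] < pd.to_datetime(df2.col1).values).sum(1)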
df1
Out[423]:
col1 count
0 2017-10-06 4
1 2017-11-15 3
2 2017-11-05 3
3 2018-10-06 2
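The broadcast comparison builds a len(df1) x len(df2) boolean matrix, which can get large for long columns. As a possible alternative, here is a minimal sketch that sorts df2's dates once and uses np.searchsorted instead. The names d1 and d2 are only illustrative, and the sketch assumes every value parses as a date (no missing values).

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'col1': ['2017-10-06', '2017-11-15', '2017-11-05', '2018-10-06']})
df2 = pd.DataFrame({'col1': ['2017-10-06', '2017-10-06', '2018-12-05', '2017-10-17',
                             '2019-10-06', '2017-12-05', '2017-3-30']})

# Parse to datetime64 so that '2017-3-30' is ordered as a date, not as text.
d1 = pd.to_datetime(df1['col1']).to_numpy()
d2 = np.sort(pd.to_datetime(df2['col1']).to_numpy())

# side='right' returns, for each df1 date, how many df2 dates are <= it;
# subtracting that from len(d2) leaves the strictly later dates.
df1['count'] = len(d2) - np.searchsorted(d2, d1, side='right')
print(df1)
```

For this data the counts should match the broadcast version (4, 3, 3, 2). The trade-off is one sort of df2 up front in exchange for avoiding the full pairwise comparison.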