I have the following two dataframes:
df = pd.DataFrame({
    'id': ['1', '1', '2', '3', '3', '8', '4', '1', '2', '4'],
    'start': ['2017-01-01', '2017-02-01', '2017-03-01', '2017-02-01', '2017-03-01', '2017-04-01', '2017-01-01', '2017-04-01', '2017-05-01', '2017-02-01'],
    'end': ['2017-01-02', '2017-02-4', '2017-03-02', '2017-02-06', '2017-03-01', '2017-04-03', '2017-01-06', '2017-04-08', '2017-05-04', '2017-02-01']
})
df1 = pd.DataFrame({
    'date': ['2017-01-02', '2017-02-01', '2017-03-01', '2017-02-01', '2017-03-01', '2017-04-01'],
    'id': ['1', '2', '3', '4', '5', '6']
})
I want to extract from df only the rows whose id matches an id in df1 and for which the date of that particular id in df1 matches or lies between start and end in df.
I can easily extract the rows of df whose id also exists in the second dataframe df1:
df_filtered = df[(df['id'].isin(df1['id']))]
But I am unable to compare the date in df1 with the start and end in df. The output I want contains only the rows of df that meet both conditions.
The date, start and end columns are already in datetime format Y-M-D. Any help would be greatly appreciated.
Answer 0 (score: 1)
You probably want merge:
df.merge(df1, on='id', how='inner')
          end id       start        date
0  2017-01-02  1  2017-01-01  2017-01-02
1   2017-02-4  1  2017-02-01  2017-01-02
2  2017-04-08  1  2017-04-01  2017-01-02
3  2017-03-02  2  2017-03-01  2017-02-01
4  2017-05-04  2  2017-05-01  2017-02-01
5  2017-02-06  3  2017-02-01  2017-03-01
6  2017-03-01  3  2017-03-01  2017-03-01
7  2017-01-06  4  2017-01-01  2017-02-01
8  2017-02-01  4  2017-02-01  2017-02-01
Then compare the date column against the start and end columns.
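One way the comparison step might look is sketched below. It is not the answerer's code: it uses only a few rows taken from the question, builds the date columns as datetimes up front (the question states they already are), and `in_range` and `result` are names chosen for illustration.

```python
import pandas as pd

# A subset of the question's rows, with the date columns built as datetimes
# (assumption: the real columns are already datetime64, as the question says).
df = pd.DataFrame({
    "id": ["1", "1", "3", "3", "4", "8"],
    "start": pd.to_datetime(["2017-01-01", "2017-04-01", "2017-02-01",
                             "2017-03-01", "2017-02-01", "2017-04-01"]),
    "end": pd.to_datetime(["2017-01-02", "2017-04-08", "2017-02-06",
                           "2017-03-01", "2017-02-01", "2017-04-03"]),
})
df1 = pd.DataFrame({
    "id": ["1", "3", "4", "5"],
    "date": pd.to_datetime(["2017-01-02", "2017-03-01", "2017-02-01", "2017-03-01"]),
})

# Attach each id's date from df1 to every df row with the same id.
merged = df.merge(df1, on="id", how="inner")

# Series.between includes both endpoints by default, so a date equal to
# start or end counts as a match.
in_range = merged["date"].between(merged["start"], merged["end"])

# Keep only the original df columns for the matching rows.
result = merged.loc[in_range, ["id", "start", "end"]]
print(result)
```

The inner merge already drops ids that are missing from df1 (id 8 here), so no separate isin filter is needed.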
Answer 1 (score: 1)
Merge and filter:
df2 = df.merge(df1)
df2[(df2['date']>=df2['start'])&(df2['date']<=df2['end'])]
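A caveat worth checking before relying on this filter: the frames in the question are built from strings, and `>=` / `<=` on strings compare character by character. That agrees with date order only when every value is zero-padded, which `'2017-02-4'` is not. The sketch below uses a hypothetical one-row frame (not data from the question, apart from the unpadded end date) to show the difference; `in_range` is an illustrative helper.

```python
from datetime import datetime

import pandas as pd

# Hypothetical merged row; the unpadded end mirrors '2017-02-4' in the question.
merged = pd.DataFrame({
    "date": ["2017-02-10"],
    "start": ["2017-02-01"],
    "end": ["2017-02-4"],
})

def in_range(frame):
    """Row-wise check that date lies within [start, end], both ends included."""
    return (frame["date"] >= frame["start"]) & (frame["date"] <= frame["end"])

# As text, '2017-02-10' <= '2017-02-4' holds because '1' sorts before '4'.
text_match = in_range(merged)

# Parsed with the standard library's strptime, which accepts the unpadded day.
parsed = merged.apply(
    lambda col: col.map(lambda s: datetime.strptime(s, "%Y-%m-%d"))
)
# As dates, 10 February falls after 4 February, so the row is rejected.
date_match = in_range(parsed)

print(text_match.tolist(), date_match.tolist())
```

If the columns really are datetimes already, as the question says, the filter above behaves as intended and no conversion is needed.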