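I am trying to match two dataframes on several columns. After that, I want to remove the matched rows from the original dataframes, but I can't get the rows I want. The reason I want to do this is that I plan to attempt several matches in turn, each one only on the rows the previous match failed to pair up.
Here is my attempt: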
import pandas as pd
# Creating the first dataframe
d1 = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5,6]}
df1 = pd.DataFrame(data=d1)
# Creating the second dataframe
d2 = {'col1': [1, 3], 'col2': [3, 4], 'col3': [5,6]}
df2 = pd.DataFrame(data=d2)
This is what I get for df1:
col1 col2 col3
0 1 3 5
1 2 4 6
And this for df2:
col1 col2 col3
0 1 3 5
1 3 4 6
The join step:
# Inner join to see the matches
fields = ['col1', 'col2']
dff = df1.merge(df2, how='inner', on=fields)
# Remove the matched rows from df1 and df2
dfs1 = df1[~df1[fields].isin(dff)]
dfs2 = df2[~df2[fields].isin(dff)]
For example, this is the result I get for dfs1:
col1 col2 col3
0 NaN NaN NaN
1 2.0 4.0 NaN
And this is the result I expect:
col1 col2 col3
0 2 4 6
Any ideas? :)
Thanks!
Answer 0 (score: 2)
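# Keep the rows of df1 where at least one cell differs from the cell at the same position in df2
new_df = df1[(~df1.isin(df2)).any(axis=1)]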
print(new_df)
Output:
col1 col2 col3
1 2 4 6
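One caveat worth keeping in mind: `df1.isin(df2)` compares cells using the aligned index and column labels, so it does not match rows on `col1` and `col2` as keys. Below is a minimal sketch of a key-based alternative, assuming the goal is an anti-join on `fields`. The helper name `anti_join` is made up for illustration and is not a pandas function.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]})
df2 = pd.DataFrame({'col1': [1, 3], 'col2': [3, 4], 'col3': [5, 6]})
fields = ['col1', 'col2']

def anti_join(left, right, on):
    """Hypothetical helper: rows of `left` whose key columns have no match in `right`."""
    # Only the key columns of `right` are needed; dropping duplicates
    # prevents the left join from multiplying rows of `left`.
    keys = right[on].drop_duplicates()
    # indicator=True adds a '_merge' column: 'left_only' or 'both'.
    merged = left.merge(keys, how='left', on=on, indicator=True)
    unmatched = merged['_merge'] == 'left_only'
    return merged.loc[unmatched, left.columns].reset_index(drop=True)

dfs1 = anti_join(df1, df2, fields)  # rows of df1 with no partner in df2
dfs2 = anti_join(df2, df1, fields)  # rows of df2 with no partner in df1
print(dfs1)
print(dfs2)
```

Because the result is re-indexed from 0, the leftover frames can be fed straight into the next matching attempt with a different list of fields.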
Answer 1 (score: 1)
You can work directly with the index like this:
df1.iloc[df1.index.difference(dff.index), :]
Which gives:
col1 col2 col3
1 2 4 6
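Note that a merge on columns returns a fresh 0..n-1 index, so `dff.index` lines up with the rows of `df1` here only because the single match happens to be the first row. A rough sketch of one way to carry the original labels through the merge, assuming `df1` has an unnamed default index and no column already called `index`:

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]})
df2 = pd.DataFrame({'col1': [1, 3], 'col2': [3, 4], 'col3': [5, 6]})
fields = ['col1', 'col2']

# reset_index() copies df1's index into a regular column named 'index',
# so each merged row still records which df1 row it came from.
dff = df1.reset_index().merge(df2, how='inner', on=fields)

# Labels of the df1 rows that found a partner in df2.
matched_labels = dff['index'].unique()

# Drop those rows by label; the remaining rows keep their original labels.
dfs1 = df1.drop(index=matched_labels)
print(dfs1)
```

The same pattern with the roles of `df1` and `df2` swapped would give the unmatched rows of `df2`.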
Answer 2 (score: 0)
# Creating the dataframes
import pandas as pd
# Creating the first dataframe
d1 = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5,6]}
df1 = pd.DataFrame(data=d1)
# Creating the second dataframe
d2 = {'col1': [1, 3], 'col2': [3, 4], 'col3': [5,6]}
df2 = pd.DataFrame(data=d2)
df1
col1 col2 col3
0 1 3 5
1 2 4 6
df2
col1 col2 col3
0 1 3 5
1 3 4 6
dff = df1.merge(df2,on=['col1','col2'])
dff
col1 col2 col3_x col3_y
0 1 3 5 5
dfs1 = df1[(~df1.col1.isin(dff.col1))&(~df1.col2.isin(dff.col2))]
dfs1
col1 col2 col3
1 2 4 6
dfs2 = df2[(~df2.col1.isin(dff.col1))&(~df2.col2.isin(dff.col2))]
dfs2
col1 col2 col3
1 3 4 6
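The filter above checks each column separately, so a row is also dropped when only one of its key values appears among the matches. A minimal sketch of a pair-wise check, assuming the keys should be compared as `(col1, col2)` tuples; the third row of `df1` is hypothetical data added only to show the difference:

```python
import pandas as pd

# The row (1, 4, 7) shares col1 with a matched row but is not itself a match.
df1 = pd.DataFrame({'col1': [1, 2, 1], 'col2': [3, 4, 4], 'col3': [5, 6, 7]})
df2 = pd.DataFrame({'col1': [1, 3], 'col2': [3, 4], 'col3': [5, 6]})
fields = ['col1', 'col2']

# Build one (col1, col2) key per row on each side.
left_keys = pd.MultiIndex.from_frame(df1[fields])
right_keys = pd.MultiIndex.from_frame(df2[fields])

# True where the whole key tuple of a df1 row also occurs in df2.
is_matched = left_keys.isin(right_keys)

dfs1 = df1[~is_matched]
print(dfs1)
```

With this data the pair-wise check keeps both `(2, 4, 6)` and `(1, 4, 7)`, whereas the column-by-column version would discard `(1, 4, 7)` because its `col1` value 1 also occurs in a matched row.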