Question

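I'm trying to match two dataframes on several columns. After that, I want to remove the matched rows from the original dataframes, but I can't get the rows I want. The reason I'm doing this is that I plan to attempt several different matches, each one on the rows that the previous match failed to pair up.

Here is my attempt: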

import pandas as pd

# Creating the first dataframe
d1 = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5,6]}  
df1 = pd.DataFrame(data=d1)    

# Creating the second dataframe
d2 = {'col1': [1, 3], 'col2': [3, 4], 'col3': [5,6]}    
df2 = pd.DataFrame(data=d2)

This is what df1 gives me:

   col1  col2  col3
0     1     3     5
1     2     4     6

And this is df2:

   col1  col2  col3
0     1     3     5
1     3     4     6

The join step:

# Inner join to see the matches
fields = ['col1', 'col2']  
dff = df1.merge(df2, how='inner', on=fields)   

# Remove the matched rows from df1 and df2
dfs1 = df1[~df1[fields].isin(dff)]  
dfs2 = df2[~df2[fields].isin(dff)]

For example, this is the result I get for dfs1:

   col1  col2  col3
0   NaN   NaN   NaN
1   2.0   4.0   NaN

And this is the result I expect:

   col1  col2  col3
0     2     4     6

Any ideas? :)

Thanks!

Answer 1

Use pandas.DataFrame.isin:

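# Keep rows where at least one cell differs from the cell at the same
# index/column position in df2. Pass axis by keyword: the positional
# form .any(1) is rejected by recent pandas versions.
new_df = df1[(~df1.isin(df2)).any(axis=1)]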
print(new_df)

Output:

   col1  col2  col3
1     2     4     6
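
Note that when it is given a DataFrame, isin compares cell by cell after aligning index and column labels, so the line above works here because the matching rows sit at the same position in both frames. If the goal is to match on the key columns only, one alternative is an anti-join built from merge with indicator=True. The following is a minimal sketch under that assumption; the helper name anti_join is made up for illustration and is not a pandas function.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]})
df2 = pd.DataFrame({'col1': [1, 3], 'col2': [3, 4], 'col3': [5, 6]})
fields = ['col1', 'col2']


def anti_join(left, right, on):
    """Return the rows of `left` whose `on` values have no match in `right`.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Keep only the de-duplicated key columns of `right`, so the merge
    # neither adds suffixed columns nor multiplies rows of `left`.
    keys = right[on].drop_duplicates()
    marked = left.merge(keys, how='left', on=on, indicator=True)
    # With unique keys on the right, a left merge returns one row per row
    # of `left`, in the same order, so a positional boolean mask is safe.
    mask = (marked['_merge'] == 'left_only').to_numpy()
    return left[mask]


dfs1 = anti_join(df1, df2, fields)  # rows of df1 with no partner in df2
dfs2 = anti_join(df2, df1, fields)  # rows of df2 with no partner in df1
print(dfs1)
print(dfs2)
```

Because the helper returns ordinary dataframes with their original index, the leftovers can be fed into a further matching attempt on a different list of columns.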

Answer 2

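You can work directly with the index. Note that this relies on dff's index lining up with df1's row positions, which holds in this example but not in general, because merge returns a freshly numbered index: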

df1.iloc[df1.index.difference(dff.index), :]

Which gives:

   col1  col2  col3
1     2     4     6
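
Since a column-on-column merge ignores the original indexes, a more general variant of the index idea is to carry df1's labels through the merge as an ordinary column. This is a sketch only; the index labels 10 and 20 and the column name orig_label are invented here to show that the approach does not depend on a default 0..n-1 index.

```python
import pandas as pd

# Same data as in the question, but with made-up, non-default index labels.
df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]},
                   index=[10, 20])
df2 = pd.DataFrame({'col1': [1, 3], 'col2': [3, 4], 'col3': [5, 6]})
fields = ['col1', 'col2']

# Name the index and turn it into a regular column so it survives the merge.
left = df1.rename_axis('orig_label').reset_index()

# Inner join on the key columns; de-duplicate df2's keys to avoid repeats.
matched = left.merge(df2[fields].drop_duplicates(), how='inner', on=fields)

# Drop the matched rows from df1 by their original labels.
dfs1 = df1.drop(index=matched['orig_label'])
print(dfs1)
```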

Answer 3

# Creating the dataframes
import pandas as pd

# Creating the first dataframe
d1 = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5,6]}  
df1 = pd.DataFrame(data=d1)    

# Creating the second dataframe
d2 = {'col1': [1, 3], 'col2': [3, 4], 'col3': [5,6]}    
df2 = pd.DataFrame(data=d2)  
df1
   col1  col2  col3
0     1     3     5
1     2     4     6
df2
   col1  col2  col3
0     1     3     5
1     3     4     6


dff = df1.merge(df2,on=['col1','col2'])
dff
   col1  col2  col3_x  col3_y
0     1     3       5       5

dfs1 = df1[(~df1.col1.isin(dff.col1))&(~df1.col2.isin(dff.col2))]
dfs1
   col1  col2  col3
1     2     4     6

dfs2 = df2[(~df2.col1.isin(dff.col1))&(~df2.col2.isin(dff.col2))]
dfs2
   col1  col2  col3
1     3     4     6
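
One thing to be aware of: the filter above tests col1 and col2 independently, so it also drops a row whose col2 value appears in some matched row even when the (col1, col2) pair as a whole did not match. The sketch below compares the two behaviours; the third row of df1, (2, 3, 7), is made up for illustration and is not part of the question's data.

```python
import pandas as pd

# df1 gets an extra, made-up row (2, 3, 7) whose col2 value equals that of
# the matched row but whose (col1, col2) pair has no partner in df2.
df1 = pd.DataFrame({'col1': [1, 2, 2], 'col2': [3, 4, 3], 'col3': [5, 6, 7]})
df2 = pd.DataFrame({'col1': [1, 3], 'col2': [3, 4], 'col3': [5, 6]})
fields = ['col1', 'col2']

dff = df1.merge(df2, on=fields)  # only the pair (1, 3) matches

# Per-column check, as in the answer: row (2, 3, 7) is dropped as well,
# because its col2 value 3 occurs in dff.col2.
per_column = df1[(~df1.col1.isin(dff.col1)) & (~df1.col2.isin(dff.col2))]

# Row-wise check: treat each (col1, col2) pair as a single key.
keys1 = pd.MultiIndex.from_frame(df1[fields])
matched_keys = list(dff[fields].itertuples(index=False, name=None))
row_wise = df1[~keys1.isin(matched_keys)]

print(per_column)
print(row_wise)
```

On the question's original two-row data both filters return the same result, so the difference only shows up when key values repeat across rows.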
