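Remove a row from a pandas DataFrame if it exists in another DataFrame

Time: 2020-02-07 10:27:17

Tags: python python-3.x pandas

I am trying to match two DataFrames on a few columns. After that, I want to remove the matched rows from the original DataFrames, but I cannot get the rows I want. The reason I want to do this is that I will try several matches in turn, moving on to the next one when the previous one fails.

Here is my attempt: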

import pandas as pd

# Creating the first dataframe
d1 = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5,6]}  
df1 = pd.DataFrame(data=d1)    

# Creating the second dataframe
d2 = {'col1': [1, 3], 'col2': [3, 4], 'col3': [5,6]}    
df2 = pd.DataFrame(data=d2)   

This is what I get for df1:

   col1  col2  col3
0     1     3     5
1     2     4     6

And this for df2:

   col1  col2  col3
0     1     3     5
1     3     4     6

The join step:

# Inner join to see the matches
fields = ['col1', 'col2']  
dff = df1.merge(df2, how='inner', on=fields)   

# Remove the matched rows from df1 and df2
dfs1 = df1[~df1[fields].isin(dff)]  
dfs2 = df2[~df2[fields].isin(dff)]  

For example, this is the result I get for dfs1:

   col1  col2  col3
0   NaN   NaN   NaN
1   2.0   4.0   NaN

And this is the result I expect:

   col1  col2  col3
0     2     4     6

Any ideas? :)

Thanks!

3 answers:

Answer 0 (score: 2)

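Use pandas.DataFrame.isin. When isin is given another DataFrame, it aligns on both index and column labels, so this compares the two frames row by row across all columns:

new_df = df1[(~df1.isin(df2)).any(axis=1)]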
print(new_df)

Output:

   col1  col2  col3
1     2     4     6
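Indexing a DataFrame with a boolean DataFrame masks individual cells instead of dropping rows, which is where the NaN values in the question come from. If the goal is to drop rows whose key columns match regardless of row position, one possible approach is an anti-join built on the `indicator` parameter of `merge`. The sketch below assumes the sample data from the question; the helper name `anti_join` is made up for illustration and is not a pandas function.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]})
df2 = pd.DataFrame({'col1': [1, 3], 'col2': [3, 4], 'col3': [5, 6]})
fields = ['col1', 'col2']


def anti_join(left, right, on):
    """Hypothetical helper: rows of `left` whose `on` values have no match in `right`."""
    # Deduplicate the keys so the left merge yields exactly one row per left row.
    keys = right[on].drop_duplicates()
    # indicator=True adds a '_merge' column: 'left_only' marks unmatched rows.
    marked = left.merge(keys, how='left', on=on, indicator=True)
    # A left merge keeps the left row order, so the mask lines up by position.
    mask = (marked['_merge'] == 'left_only').to_numpy()
    return left[mask]


dfs1 = anti_join(df1, df2, fields)
dfs2 = anti_join(df2, df1, fields)
print(dfs1)
print(dfs2)
```

Because the mask is applied by position, the original index labels of df1 and df2 are preserved in the results.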

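Answer 1 (score: 1)

You can work directly with the index. Note that merge on columns returns a fresh 0..n-1 index, so this only works when the matched rows happen to sit at those same positions in df1, as they do in this example: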

df1.iloc[df1.index.difference(dff.index), :]

Which gives:

   col1  col2  col3
1     2     4     6
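When the matched rows are not guaranteed to sit at the top of df1, one option is to carry df1's original index through the merge as an ordinary column and drop by those labels. This is a minimal sketch under that assumption; the column name `orig_index` and the reordered sample data are made up for illustration.

```python
import pandas as pd

# Same rows as in the question, but the matching row is now second in df1.
df1 = pd.DataFrame({'col1': [2, 1], 'col2': [4, 3], 'col3': [6, 5]})
df2 = pd.DataFrame({'col1': [1, 3], 'col2': [3, 4], 'col3': [5, 6]})
fields = ['col1', 'col2']

# Name the index and turn it into a regular column so it survives the merge.
left = df1.rename_axis('orig_index').reset_index()

# Inner join on the key columns only; deduplicate keys to avoid repeated rows.
matched = left.merge(df2[fields].drop_duplicates(), how='inner', on=fields)

# Drop the matched rows from df1 by their original index labels.
dfs1 = df1.drop(index=matched['orig_index'])
print(dfs1)
```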

Answer 2 (score: 0)

# Creating the dataframes
import pandas as pd

# Creating the first dataframe
d1 = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5,6]}  
df1 = pd.DataFrame(data=d1)    

# Creating the second dataframe
d2 = {'col1': [1, 3], 'col2': [3, 4], 'col3': [5,6]}    
df2 = pd.DataFrame(data=d2)  
df1
   col1  col2  col3
0     1     3     5
1     2     4     6
df2
   col1  col2  col3
0     1     3     5
1     3     4     6


dff = df1.merge(df2,on=['col1','col2'])
dff
   col1  col2  col3_x  col3_y
0     1     3       5       5

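# Note: each column is tested on its own, so a row is dropped as soon as
# either its col1 or its col2 value appears anywhere in dff.
dfs1 = df1[(~df1.col1.isin(dff.col1))&(~df1.col2.isin(dff.col2))]
dfs1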
   col1  col2  col3
1     2     4     6

dfs2 = df2[(~df2.col1.isin(dff.col1))&(~df2.col2.isin(dff.col2))]
dfs2
   col1  col2  col3
1     3     4     6
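The two isin tests above check col1 and col2 independently, so a row such as (1, 9) would be removed just because 1 appears in dff.col1. If the intent is to match on the (col1, col2) pair as a whole, one possible sketch is to compare the key columns as tuples through a MultiIndex. The extra third row in df1 is made up here to show the difference.

```python
import pandas as pd

# Third row shares col1 with a matched row but is not a full (col1, col2) match.
df1 = pd.DataFrame({'col1': [1, 2, 1], 'col2': [3, 4, 9], 'col3': [5, 6, 7]})
df2 = pd.DataFrame({'col1': [1, 3], 'col2': [3, 4], 'col3': [5, 6]})
fields = ['col1', 'col2']

# Build one (col1, col2) tuple key per row of each frame.
keys1 = pd.MultiIndex.from_frame(df1[fields])
keys2 = pd.MultiIndex.from_frame(df2[fields])

# isin on a MultiIndex tests whole tuples and returns a boolean array,
# so only rows whose full key pair appears in df2 are removed.
dfs1 = df1[~keys1.isin(keys2)]
print(dfs1)
```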