Compare multiple columns across two DataFrames and select the rows with differing values

Asked: 2018-11-27 12:02:53

Tags: python excel pandas dataframe conditional-formatting

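I am trying to compare 2 columns in one DataFrame (df1) with 2 columns in another DataFrame (df2). After the comparison, I want to select the rows where those two columns do not match. My attempts are shown below.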

import pandas as pd

fd1= 'Q37.xlsx'
fd2= 'Q43.xlsx'
df1 = pd.read_excel(fd1, sheet_name='prio 1')
df2 = pd.read_excel(fd2, sheet_name='prio 1')


closed_items= {} #items in fd1 but not in fd2
new_items={}  #items in fd2 but not in fd1

To get closed_items, I tried the following three things:

closed_items.where(df1[df1['Code'].values!=df2[df2['Code'].values and 
                   df1['Owner'].values != key in df1['Owner'].values)

This gives:

ValueError: Can only compare identically-labeled Series objects

I also tried:

Closed_items = df2.loc[(df2['Code'] != df1['Code']) and 
               df2.loc[(df2['Owner'] != df1['Owner'])]

Finally, I tried:

for key in df1['Code'].values:
    if key in df1['Code'].values != key in df1['Code'].values or key in 
              df1['Owner'].values != key in df1['Owner'].values:

          closed_items.append()
     else:
           pass 

which gives this error:

 The truth value of an array with more than one element is ambiguous. 
 Use a.any() or a.all()

...

AFP= pd.ExcelWriter("AFP.xlsx", engine='xlsxwriter')

closed_items.to_excel(AFP, sheet_name='Closed', index=False)

1 answer:

Answer 0 (score: 0)

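The problem is that df1 and df2 have different shapes, so a row-by-row comparison of their columns cannot work. First you need to merge df1 and df2, like this: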

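# 'common_key' is presumably a placeholder for the column that identifies the same item in both frames
df3 = df1.merge(df2, on='common_key', how='left', suffixes=('_df1', '_df2'))
df3['select'] = 0
# flag the rows where both Code and Owner agree between the two frames
df3.loc[(df3['Code_df1'] == df3['Code_df2']) &
        (df3['Owner_df1'] == df3['Owner_df2']), 'select'] = 1

# keep the rows that were not flagged, i.e. the mismatches
df3.loc[df3['select'] == 0, :]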

This returns the rows where the values do not match.
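
If the two frames share no separate key column, one possible variation is to merge on Code and Owner themselves and let pandas mark where each row came from. The sketch below is only an illustration under assumptions: the sample data is made up to stand in for the two Excel sheets, the Status column and the helper name rows_only_in are hypothetical, and a (Code, Owner) pair is assumed to be enough to identify an item. It uses the indicator=True option of DataFrame.merge, which adds a _merge column with the values left_only, right_only or both.

```python
import pandas as pd

# Made-up sample data standing in for the 'prio 1' sheets of Q37.xlsx and Q43.xlsx.
df1 = pd.DataFrame({
    "Code": ["A1", "A2", "A3"],
    "Owner": ["Ann", "Bob", "Cid"],
    "Status": ["open", "open", "open"],  # hypothetical extra column
})
df2 = pd.DataFrame({
    "Code": ["A2", "A3", "A4", "A5"],
    "Owner": ["Bob", "Dan", "Eve", "Fay"],
    "Status": ["open", "open", "open", "open"],
})

keys = ["Code", "Owner"]  # assumption: this pair identifies an item


def rows_only_in(left, right, keys):
    """Return the rows of `left` whose `keys` combination does not occur in `right`."""
    # Merge against the de-duplicated key columns of `right` only, so that no
    # suffixed columns appear and no row of `left` is repeated.
    marked = left.merge(
        right[keys].drop_duplicates(), on=keys, how="left", indicator=True
    )
    # 'left_only' marks rows that found no partner in `right`.
    return marked.loc[marked["_merge"] == "left_only"].drop(columns="_merge")


closed_items = rows_only_in(df1, df2, keys)  # in df1 but not in df2
new_items = rows_only_in(df2, df1, keys)     # in df2 but not in df1

print(closed_items)
print(new_items)
```

Because both results are ordinary DataFrames, they can be written out with to_excel in the same way as in the question. Note that a row whose Code exists in both files but whose Owner changed shows up in both closed_items and new_items; whether that is the desired behaviour depends on how the items are meant to be identified.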