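I am trying to compare 2 columns in one dataframe (df1) with 2 columns in another dataframe (df2). After the comparison, I want to select the rows where those two columns do not match. My attempts are below, and this is what the dataframes look like [1]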
import pandas as pd
fd1= 'Q37.xlsx'
fd2= 'Q43.xlsx'
df1 = pd.read_excel(fd1, sheet_name='prio 1')  # 'sheetname' was renamed to 'sheet_name' in newer pandas
df2 = pd.read_excel(fd2, sheet_name='prio 1')
closed_items= {} #items in fd1 but not in fd2
new_items={} #items in fd2 but not in fd1
To get closed_items, I tried the following three things:
closed_items.where(df1[df1['Code'].values!=df2[df2['Code'].values and
df1['Owner'].values != key in df1['Owner'].values)
which gives
ValueError: Can only compare identically-labeled Series objects
I also tried
Closed_items = df2.loc[(df2['Code'] != df1['Code']) and
df2.loc[(df2['Owner'] != df1['Owner'])]
Finally, I tried
for key in df1['Code'].values:
    if (key in df1['Code'].values != key in df1['Code'].values or key in
            df1['Owner'].values != key in df1['Owner'].values):
        closed_items.append()
    else:
        pass
which gives this error:
The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()
...
AFP= pd.ExcelWriter("AFP.xlsx", engine='xlsxwriter')
closed_items.to_excel(AFP, sheet_name='Closed', index=False)
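Answer 0 (score: 0)
The problem is that df1 and df2 have different shapes, so a row-by-row positional comparison cannot work. You first need to merge df1 and df2, like this: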
df3 = df1.merge(df2,on='common_key',how='left',suffixes=('_df1','_df2'))
df3['select'] = 0
# flag the rows where both Code and Owner agree between the two frames
df3.loc[(df3['Code_df1'] == df3['Code_df2']) &
        (df3['Owner_df1'] == df3['Owner_df2']), 'select'] = 1
df3.loc[df3['select']==0,:]
This returns the rows where the columns do not match.
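If there is no separate common key and a row is identified by its (Code, Owner) pair, one possible variation is to merge on both columns and let pandas report which side each pair came from. The sketch below is only an illustration under that assumption: the sample data is made up, and only the column names `Code` and `Owner` and the names `closed_items` and `new_items` come from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the two Excel sheets; only the 'Code' and
# 'Owner' column names are taken from the question.
df1 = pd.DataFrame({"Code": ["A1", "B2", "C3"],
                    "Owner": ["ann", "bob", "cat"]})
df2 = pd.DataFrame({"Code": ["B2", "C3", "D4", "E5"],
                    "Owner": ["bob", "dan", "eve", "fay"]})

keys = ["Code", "Owner"]

# Outer merge on both columns. indicator=True adds a '_merge' column that
# says whether each (Code, Owner) pair was found in the left frame only,
# the right frame only, or both.
merged = df1.merge(df2, on=keys, how="outer", indicator=True)

# Pairs present in df1 but not in df2
closed_items = merged.loc[merged["_merge"] == "left_only", keys].reset_index(drop=True)

# Pairs present in df2 but not in df1
new_items = merged.loc[merged["_merge"] == "right_only", keys].reset_index(drop=True)

print(closed_items)
print(new_items)
```

Because the merge aligns rows by value rather than by position, it does not matter that the two frames have different lengths. The results are ordinary DataFrames, so they can be passed to `to_excel` as in the question.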