Pandas - check for any unmatched records

Date: 2018-04-23 12:50:38

Tags: python pandas dataframe

I am trying to match records using the following conditions:

Input:

If df1['Cntr_No'] == df2['Cntr_No']
    check if df1['Total_Amount'] == df2['Total_Amount']
    else check if df1['Total_Amount'] == df2['Amount_2'] or df1['Total_Amount'] == df2['Amount_3']

If there is a match, create a new column "Match" with the value "Yes"; otherwise set it to "No".

Sample data:

import pandas as pd
df1 = pd.DataFrame({'Cntr_No': ['HLBU 1234567'], 'Total_Amount': 100})
df2 = pd.DataFrame({'Cntr_No': ['HLBU 1234567'], 'Total_Amount': 50, 'Amount_2': 40, 'Amount_3': 100})

Sample output for one row:

    df1: HLBU 1234567 | df1: Total Amount: 100 | df2: HLBU 1234567 |
    df2: Total Amount: 50 | df2: Amount 2: 40 | df2: Amount 3: 100 | Matched

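2 answers:

Answer 0 (score: 2)

One approach is to build a dictionary that maps each container number to its set of amounts, then use a list comprehension: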

cols = ['Amount_2', 'Amount_3', 'Total_Amount']

d = {k: set(v.values()) for k, v in \
        df2.set_index('Cntr_No')[cols].to_dict(orient='index').items()}

df1['Check'] = [j in d.get(i, set()) for i, j in zip(df1['Cntr_No'], df1['Total_Amount'])]

df1['Check'] = df1['Check'].map({True: 'Match', False: 'No'})

Result:

        Cntr_No  Total_Amount  Check
0  HLBU 1234567           100  Match
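
To see how the dictionary approach treats an unmatched amount and a container number that is missing from df2, here is a self-contained sketch. Only the first row comes from the question; the rows for `ABCD 0000001` and `ZZZZ 9999999` are made-up data, and the variable name `amounts` is just an illustrative choice.

```python
import pandas as pd

# Hypothetical data: the first row is from the question, the others are invented.
df1 = pd.DataFrame({
    'Cntr_No': ['HLBU 1234567', 'ABCD 0000001', 'ZZZZ 9999999'],
    'Total_Amount': [100, 75, 10],
})
df2 = pd.DataFrame({
    'Cntr_No': ['HLBU 1234567', 'ABCD 0000001'],
    'Total_Amount': [50, 60],
    'Amount_2': [40, 20],
    'Amount_3': [100, 30],
})

cols = ['Amount_2', 'Amount_3', 'Total_Amount']

# Map each container number to the set of amounts recorded for it in df2.
# to_dict(orient='index') requires the Cntr_No values in df2 to be unique.
amounts = {
    cntr: set(row.values())
    for cntr, row in df2.set_index('Cntr_No')[cols].to_dict(orient='index').items()
}

# A row matches when its amount is in the set for the same container;
# containers missing from df2 fall back to an empty set and never match.
df1['Check'] = [
    'Match' if amt in amounts.get(cntr, set()) else 'No'
    for cntr, amt in zip(df1['Cntr_No'], df1['Total_Amount'])
]
print(df1)
```

Because the lookup is keyed by `Cntr_No`, an amount is only compared against the amounts of its own container.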

Answer 1 (score: 1)

I think using isin is straightforward:

In [504]: df2['Check'] = (
     ...:     df1.Cntr_No.isin(df2.Cntr_No)
     ...:     & (df1.Total_Amount.isin(df2.Amount_2)
     ...:        | df1.Total_Amount.isin(df2.Amount_3)
     ...:        | df1.Total_Amount.isin(df2.Total_Amount))
     ...: ).map({True: 'Match', False: 'No'})

In [505]: df2
Out[505]: 
   Amount_2  Amount_3       Cntr_No  Total_Amount  Check
0        40       100  HLBU 1234567            50  Match
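
One thing to keep in mind: isin tests whether a value appears anywhere in the other column, not in the row with the same Cntr_No. With a single row this makes no difference, but with several rows an amount could be flagged as a match against a different container. A row-wise alternative, sketched here with a left merge, might look like the following. The second and third rows and the `_df2` suffix are assumptions for illustration, not part of the question.

```python
import pandas as pd

# Hypothetical data: only the first row of each frame comes from the question.
df1 = pd.DataFrame({
    'Cntr_No': ['HLBU 1234567', 'ABCD 0000001', 'ZZZZ 9999999'],
    'Total_Amount': [100, 40, 10],
})
df2 = pd.DataFrame({
    'Cntr_No': ['HLBU 1234567', 'ABCD 0000001'],
    'Total_Amount': [50, 60],
    'Amount_2': [40, 20],
    'Amount_3': [100, 30],
})

# A left merge keeps every df1 row; df2's Total_Amount receives the '_df2' suffix.
merged = df1.merge(df2, on='Cntr_No', how='left', suffixes=('', '_df2'))

# Compare df1's amount with each df2 amount in the same row.
# Rows with no df2 partner hold NaN there, and NaN never compares equal.
candidates = merged[['Total_Amount_df2', 'Amount_2', 'Amount_3']]
matched = candidates.eq(merged['Total_Amount'], axis=0).any(axis=1)

merged['Match'] = matched.map({True: 'Yes', False: 'No'})
print(merged)
```

In this made-up data the amount 40 for `ABCD 0000001` occurs in df2 only under `HLBU 1234567`, so the row-wise check reports "No", whereas a column-wide isin test would find it. If df2 can contain several rows per container, the merge produces one output row per pair, which may or may not be what is wanted.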