Copy rows from one dataframe to another if the values of specific columns match in both dataframes

Time: 2019-11-11 14:19:42

Tags: python-3.x pandas dataframe

Since the original dataset is large, I am providing sample data here to explain my problem:

import pandas as pd

data_a = {'Buyer':['Company1','Company2','Company3','Company4','Company5','Company6','Company7','Company8'], 
          'Seller':['Company9','Company10','Company11','Company12','Company13','Company14','Company15',
                    'Company16']}
a_df = pd.DataFrame(data_a)

data_b = {'Buyer':['Company7','Company2','Company1','Company3','Company5'], 
          'Seller':['Company15','Company7','Company9','Company11','Company10'], 
          'Company_Number':[1,2,3,4,5],'Date':['01-01-11','02-02-12','03-03-13','04-04-14','05-05-15'],
          'Deal':['Success','Failure','Success','Success','Ongoing']}
b_df = pd.DataFrame(data_b)
print(a_df)
print(b_df)

Console output:

a_df = 
      Buyer     Seller
0  Company1   Company9
1  Company2  Company10
2  Company3  Company11
3  Company4  Company12
4  Company5  Company13
5  Company6  Company14
6  Company7  Company15
7  Company8  Company16
b_df = 
      Buyer     Seller  Company_Number      Date     Deal
0  Company7  Company15               1  01-01-11  Success
1  Company2   Company7               2  02-02-12  Failure
2  Company1   Company9               3  03-03-13  Success
3  Company3  Company11               4  04-04-14  Success
4  Company5  Company10               5  05-05-15  Ongoing

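Now, wherever both 'Buyer' and 'Seller' match, I want to copy the 'Company_Number', 'Date', and 'Deal' values from the dataframe b_df into a_df. Note that matching rows need not have the same index in the two dataframes. The expected result is as follows: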

a_df =
      Buyer     Seller Company_Number      Date     Deal
0  Company1   Company9              3  03-03-13  Success
1  Company2  Company10            NaN       NaN      NaN
2  Company3  Company11              4  04-04-14  Success
3  Company4  Company12            NaN       NaN      NaN
4  Company5  Company13            NaN       NaN      NaN
5  Company6  Company14            NaN       NaN      NaN
6  Company7  Company15              1  01-01-11  Success
7  Company8  Company16            NaN       NaN      NaN

1 Answer:

Answer 0 (score: 1)

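You can use the DataFrame.merge() function (or the equivalent pd.merge()). With no `on` argument, the merge joins on the columns the two dataframes have in common, here 'Buyer' and 'Seller', and how='left' keeps every row of a_df, filling unmatched rows with NaN.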

pd.merge(a_df,b_df,how='left')

Output:

      Buyer     Seller  Company_Number      Date     Deal
0  Company1   Company9             3.0  03-03-13  Success
1  Company2  Company10             NaN       NaN      NaN
2  Company3  Company11             4.0  04-04-14  Success
3  Company4  Company12             NaN       NaN      NaN
4  Company5  Company13             NaN       NaN      NaN
5  Company6  Company14             NaN       NaN      NaN
6  Company7  Company15             1.0  01-01-11  Success
7  Company8  Company16             NaN       NaN      NaN
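
Note that Company_Number comes back as float (3.0 rather than 3) because the unmatched rows hold NaN. As a hedged variant, not part of the original answer, the sketch below names the join keys explicitly, asks pandas to verify that each (Buyer, Seller) pair occurs at most once in b_df, and casts the column to the nullable 'Int64' dtype. The variable name `merged` is my own, and the nullable dtype assumes a reasonably recent pandas version.

```python
import pandas as pd

# Sample data copied from the question.
a_df = pd.DataFrame({
    'Buyer': ['Company1', 'Company2', 'Company3', 'Company4',
              'Company5', 'Company6', 'Company7', 'Company8'],
    'Seller': ['Company9', 'Company10', 'Company11', 'Company12',
               'Company13', 'Company14', 'Company15', 'Company16'],
})
b_df = pd.DataFrame({
    'Buyer': ['Company7', 'Company2', 'Company1', 'Company3', 'Company5'],
    'Seller': ['Company15', 'Company7', 'Company9', 'Company11', 'Company10'],
    'Company_Number': [1, 2, 3, 4, 5],
    'Date': ['01-01-11', '02-02-12', '03-03-13', '04-04-14', '05-05-15'],
    'Deal': ['Success', 'Failure', 'Success', 'Success', 'Ongoing'],
})

# Left join on both key columns: every row of a_df is kept, and a row of
# b_df contributes only when Buyer AND Seller both match.
# validate='many_to_one' raises MergeError if a (Buyer, Seller) pair is
# duplicated in b_df, which would otherwise silently duplicate rows.
merged = a_df.merge(b_df, on=['Buyer', 'Seller'], how='left',
                    validate='many_to_one')

# Assumption: integer display is wanted. 'Int64' (capital I) is the
# nullable integer dtype, so missing values show as <NA> instead of
# forcing the whole column to float.
merged['Company_Number'] = merged['Company_Number'].astype('Int64')

print(merged)
```

Spelling out `on=` is optional for this data, but it guards against an accidental join on some other column that both dataframes happen to share later on.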