python pandas: merging 2 DataFrames

Time: 2018-04-20 14:23:16

Tags: python pandas left-join

I want to join the two DataFrames shown below using Python's pandas:

import pandas as pd

customer_orders = pd.DataFrame({'customerID': [1, 2, 2, 1],
                                'customerName': ['John', 'Anna', 'Anna', 'John'],
                                'customerAge': [21, 45, 45, 21],
                                'orderID': [255, 256, 257, 258],
                                'paymentType': ['visa', 'bank', 'master', 'paypal']})

This creates:

   customerAge  customerID customerName  orderID paymentType
0           21           1         John      255        visa
1           45           2         Anna      256        bank
2           45           2         Anna      257      master
3           21           1         John      258      paypal

order_products = pd.DataFrame({'orderID': [255, 255, 257, 258, 255, 257],
                           'price': [9.99, 23.40, 15.89, 3.99, 89.50, 23.40],
                           'productName': ['filter', 'cosmetic', 'shampoo', 'tissues', 'elecBrush', 'cosmetic']})

This creates:

   orderID  price productName
0      255   9.99      filter
1      255  23.40    cosmetic
2      257  15.89     shampoo
3      258   3.99     tissues
4      255  89.50   elecBrush
5      257  23.40    cosmetic

The expected output looks like this:

customerAge  customerID  customerName  orderID  paymentType  orderID  price  productName
         21           1          John      255         visa      255   9.99       filter
         21           1          John      255         visa      255  23.40     cosmetic
         21           1          John      255         visa      255  89.50    elecBrush
         45           2          Anna      256         bank     null   null         null
         45           2          Anna      257       master      257  15.89      shampoo
         45           2          Anna      257       master      257  23.40     cosmetic
         21           1          John      258       paypal      258   3.99      tissues

As far as I can tell, this is a SQL left join. But using

all = customer_orders.join(order_products, on="orderID", how='left', lsuffix='_left', rsuffix='_right')

does not give me what I want (too few rows, and NaN instead of the values from the second table).

What am I missing?

2 answers:

Answer 0: (score: 4)

Left? No, this is an outer join.

customer_orders.merge(order_products, on="orderID", how='outer')

   customerAge  customerID customerName  orderID paymentType  price  \
0           21           1         John      255        visa   9.99   
1           21           1         John      255        visa  23.40   
2           21           1         John      255        visa  89.50   
3           45           2         Anna      256        bank    NaN   
4           45           2         Anna      257      master  15.89   
5           45           2         Anna      257      master  23.40   
6           21           1         John      258      paypal   3.99   

  productName  
0      filter  
1    cosmetic  
2   elecBrush  
3         NaN  
4     shampoo  
5    cosmetic  
6     tissues  
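
A note on left versus outer: for this particular data the two should return the same rows, because every orderID in order_products also appears in customer_orders. The following is a minimal sketch to check that, rebuilding the two frames from the question; the names left and outer are illustrative only.

```python
import pandas as pd

# Frames rebuilt from the question.
customer_orders = pd.DataFrame({'customerID': [1, 2, 2, 1],
                                'customerName': ['John', 'Anna', 'Anna', 'John'],
                                'customerAge': [21, 45, 45, 21],
                                'orderID': [255, 256, 257, 258],
                                'paymentType': ['visa', 'bank', 'master', 'paypal']})
order_products = pd.DataFrame({'orderID': [255, 255, 257, 258, 255, 257],
                               'price': [9.99, 23.40, 15.89, 3.99, 89.50, 23.40],
                               'productName': ['filter', 'cosmetic', 'shampoo',
                                               'tissues', 'elecBrush', 'cosmetic']})

# Same key column, two join types.
left = customer_orders.merge(order_products, on="orderID", how="left")
outer = customer_orders.merge(order_products, on="orderID", how="outer")

# Both results have 7 rows; order 256 has no products, so it is the
# single row whose price is NaN in each result.
print(len(left), len(outer))
print(left["price"].isna().sum(), outer["price"].isna().sum())
```

The two would differ only if order_products contained an orderID that is missing from customer_orders: how='outer' would keep that row with NaN in the customer columns, while how='left' would drop it.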

Answer 1: (score: 0)

Try using merge:

all = customer_orders.merge(order_products, on="orderID", how='left')
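
As for why the join call in the question fell short: DataFrame.join with on= looks up the caller's orderID values in the other frame's index, not in its orderID column. order_products still has the default index 0 to 5, so no key matches. The sketch below reproduces that and shows one possible workaround, moving orderID into the index of the right-hand frame. It assumes the frames from the question; the names failed and fixed are illustrative only.

```python
import pandas as pd

# Frames rebuilt from the question.
customer_orders = pd.DataFrame({'customerID': [1, 2, 2, 1],
                                'customerName': ['John', 'Anna', 'Anna', 'John'],
                                'customerAge': [21, 45, 45, 21],
                                'orderID': [255, 256, 257, 258],
                                'paymentType': ['visa', 'bank', 'master', 'paypal']})
order_products = pd.DataFrame({'orderID': [255, 255, 257, 258, 255, 257],
                               'price': [9.99, 23.40, 15.89, 3.99, 89.50, 23.40],
                               'productName': ['filter', 'cosmetic', 'shampoo',
                                               'tissues', 'elecBrush', 'cosmetic']})

# The call from the question: orderID values 255..258 are compared with
# the right frame's index 0..5, so there are no matches.
failed = customer_orders.join(order_products, on="orderID", how="left",
                              lsuffix="_left", rsuffix="_right")
print(len(failed))                    # one row per customer order
print(failed["price"].isna().all())   # right-hand values are all NaN

# Workaround: make orderID the index of the right frame, so join has
# the intended key to match against.
fixed = customer_orders.join(order_products.set_index("orderID"),
                             on="orderID", how="left")
print(len(fixed))                     # one row per (order, product) pair
```

merge(on="orderID") compares the two orderID columns directly, which is why it needs no index change.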