How to map two DataFrames with pandas

Time: 2019-07-04 16:08:16

Tags: python pandas dataframe mapping

I have two Excel files:

+ File one contains specific data about each customer (e.g. Sex, Age, Name, ...), and
+ File two contains the transactions for each customer.

I want to create a new column in File2 that contains specific data from File1 for each customer.

2 answers:

Answer 0: (score: 0)

file1.csv

customer_id,sex,age,name
af4wf3,m,12,mike
z20ask,f,15,sam

file2.csv

transaction_id,customer_id,amount
12h2j4hk,af4wf3,123.20
12h2j4h1,af4wf3,5.22
12h2j4h2,z20ask,13.20
12h2j4h3,af4wf3,1.20
12h2j4h4,z20ask,2341.12
12h2j4h5,z20ask,235.96
12h2j4h6,af4wf3,999.30

Load and join the DataFrames

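import pandas as pd

df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')

# Index each frame by its own key so customers can be looked up by id
df1.set_index('customer_id', inplace=True)
df2.set_index('transaction_id', inplace=True)

# Left join: match df2's customer_id column against df1's index
output = df2.join(df1, on='customer_id')
output.to_csv('file2_updated.csv')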

file2_updated.csv

transaction_id,customer_id,amount,sex,age,name
12h2j4hk,af4wf3,123.2,m,12,mike
12h2j4h1,af4wf3,5.22,m,12,mike
12h2j4h2,z20ask,13.2,f,15,sam
12h2j4h3,af4wf3,1.2,m,12,mike
12h2j4h4,z20ask,2341.12,f,15,sam
12h2j4h5,z20ask,235.96,f,15,sam
12h2j4h6,af4wf3,999.3,m,12,mike
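
If only one column from file 1 is needed, rather than all of them, Series.map may be enough. The following is a minimal sketch, not part of the original answer: it inlines a shortened copy of the sample data with io.StringIO so it runs on its own, the variable names (customers, transactions, name_by_customer) are made up for illustration, and it assumes each customer_id appears only once in file 1.

```python
import io
import pandas as pd

# Shortened copy of the sample data above, inlined so the sketch is self-contained.
customers = pd.read_csv(io.StringIO(
    "customer_id,sex,age,name\n"
    "af4wf3,m,12,mike\n"
    "z20ask,f,15,sam\n"
))
transactions = pd.read_csv(io.StringIO(
    "transaction_id,customer_id,amount\n"
    "12h2j4hk,af4wf3,123.20\n"
    "12h2j4h1,af4wf3,5.22\n"
    "12h2j4h2,z20ask,13.20\n"
))

# Lookup Series: index = customer_id, values = name.
# Assumption: customer_id is unique in the customer table.
name_by_customer = customers.set_index("customer_id")["name"]

# Series.map looks up each transaction's customer_id in that Series;
# ids with no match in the lookup become NaN.
transactions["name"] = transactions["customer_id"].map(name_by_customer)
print(transactions)
```

This leaves the transaction table's shape and row order untouched and adds just the one column, which is closer to "mapping" than a full join.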

Answer 1: (score: 0)

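Same as @jc416's answer, but using pd.merge (file1 and file2 here are the DataFrames loaded from file1.csv and file2.csv):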

file2.merge(file1, on='customer_id')

    transaction_id  customer_id  amount     sex  age    name
0   12h2j4hk        af4wf3       123.2      m    12     mike
1   12h2j4h1        af4wf3       5.22       m    12     mike
2   12h2j4h3        af4wf3       1.2        m    12     mike
3   12h2j4h6        af4wf3       999.3      m    12     mike
4   12h2j4h2        z20ask       13.2       f    15     sam
5   12h2j4h4        z20ask       2341.12    f    15     sam
6   12h2j4h5        z20ask       235.96     f    15     sam
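
By default merge performs an inner join, so a transaction whose customer_id is missing from file 1 would be dropped silently. The sketch below shows one possible way to guard against that, using the how, validate and indicator parameters of DataFrame.merge. It is an illustration under assumptions, not part of the original answer: the data is inlined, and the row with customer id "unknown1" is invented to show the unmatched case.

```python
import io
import pandas as pd

customers = pd.read_csv(io.StringIO(
    "customer_id,sex,age,name\n"
    "af4wf3,m,12,mike\n"
    "z20ask,f,15,sam\n"
))
# The last row uses a made-up customer id that is not in the customer table.
transactions = pd.read_csv(io.StringIO(
    "transaction_id,customer_id,amount\n"
    "12h2j4hk,af4wf3,123.20\n"
    "12h2j4h2,z20ask,13.20\n"
    "12h2j4h9,unknown1,7.50\n"
))

merged = transactions.merge(
    customers,
    on="customer_id",
    how="left",              # keep every transaction, matched or not
    validate="many_to_one",  # raise if customer_id is duplicated in customers
    indicator=True,          # add a _merge column: "both" or "left_only"
)

# Transactions that found no customer record.
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched[["transaction_id", "customer_id"]])
```

With how="left" the result has one row per transaction, and the customer columns are NaN wherever no match was found.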

You should definitely read Pandas Merging 101.