I have 2 files of different sizes (the customer_id values are in a different order in the two files):
data = pd.read_csv('data.csv')
id name country town customer_id
xxxx Anna UK London sahdghkl
yyyy Maria USA Huston avrnnfgs
cccc Peter FR Paris eesfawsd
data2 = pd.read_csv('data2.csv')
customer_id card_id bank date
sahdghkl 5975845 aaaaa 20000101
avrnnfgs 1122255 bbbbb 20010101
eesfawsd 3366552 ccccc 20020101
I want to get this output:
result
id name country town customer_id card_id bank date
xxxx Anna UK London sahdghkl 5975845 aaaaa 20000101
yyyy Maria USA Huston avrnnfgs 1122255 bbbbb 20010101
cccc Peter FR Paris eesfawsd 3366552 ccccc 20020101
Answer 0 (score: 0)
Try using pandas.merge:
import io
import pandas as pd

# Sample data standing in for data.csv
temp = u"""id name country town customer_id
xxxx Anna UK London sahdghkl
yyyy Maria USA Huston avrnnfgs
cccc Peter FR Paris eesfawsd"""
# sep=r'\s+' splits on whitespace (delim_whitespace is deprecated in recent pandas)
data = pd.read_csv(io.StringIO(temp), header=0, sep=r'\s+')
temp = u"""customer_id card_id bank date
sahdghkl 5975845 aaaaa 20000101
avrnnfgs 1122255 bbbbb 20010101
eesfawsd 3366552 ccccc 20020101"""
data2 = pd.read_csv(io.StringIO(temp), header=0, sep=r'\s+')
# Join the two frames on the shared customer_id column
df = pd.merge(data, data2, on='customer_id')
print(df)
id name country town customer_id card_id bank date
0 xxxx Anna UK London sahdghkl 5975845 aaaaa 20000101
1 yyyy Maria USA Huston avrnnfgs 1122255 bbbbb 20010101
2 cccc Peter FR Paris eesfawsd 3366552 ccccc 20020101
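Since the question mentions that customer_id is ordered differently in the two files, here is a minimal sketch showing that merge matches rows by key value rather than by position. The frames below are a trimmed copy of the sample data with data2 deliberately shuffled. The validate argument is an optional extra check that was not part of the original answer.

```python
import pandas as pd

# Trimmed copy of the sample data; data2 rows are deliberately out of order
data = pd.DataFrame({
    "id": ["xxxx", "yyyy", "cccc"],
    "name": ["Anna", "Maria", "Peter"],
    "customer_id": ["sahdghkl", "avrnnfgs", "eesfawsd"],
})
data2 = pd.DataFrame({
    "customer_id": ["eesfawsd", "sahdghkl", "avrnnfgs"],
    "card_id": [3366552, 5975845, 1122255],
})

# validate="one_to_one" raises MergeError if a customer_id repeats on either side
df = pd.merge(data, data2, on="customer_id", validate="one_to_one")
print(df)
```

With the default inner join, the result follows the row order of the left frame (data), so each customer ends up next to their own card_id regardless of how data2 was sorted.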
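If one of your two dataframes has more rows than the other and you want to keep all rows, add how = 'outer'; if you only want to keep the rows that appear in the dataframe, add: {{ 1}}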
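As a rough sketch of how the how argument changes which rows survive, consider two small frames whose keys only partly overlap. The frame names and the customer zzzzzzzz are made up for illustration, and how='left' is shown only as one of the available options, not necessarily the one the answer had in mind.

```python
import pandas as pd

# Illustrative frames: 'eesfawsd' has no card, and the hypothetical
# customer 'zzzzzzzz' has a card but no entry in customers
customers = pd.DataFrame({
    "name": ["Anna", "Maria", "Peter"],
    "customer_id": ["sahdghkl", "avrnnfgs", "eesfawsd"],
})
cards = pd.DataFrame({
    "customer_id": ["zzzzzzzz", "avrnnfgs", "sahdghkl"],
    "card_id": [9999999, 1122255, 5975845],
})

# Default how='inner': only keys present in both frames
inner = pd.merge(customers, cards, on="customer_id")
# how='outer': every key from either frame, missing values become NaN
outer = pd.merge(customers, cards, on="customer_id", how="outer")
# how='left': every row of customers, matched cards where available
left = pd.merge(customers, cards, on="customer_id", how="left")

print(len(inner), len(outer), len(left))  # prints: 2 4 3
```

Rows without a match get NaN in the columns coming from the other frame, which is why card_id becomes a float column in the outer and left results.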