Merging two Pandas DataFrames on an approximate or exact match

Date: 2018-09-24 17:00:29

Tags: python pandas dataframe merge

Below are examples of the DataFrames I want to merge.

#!/usr/bin/env python

import pandas as pd

countries   = ['Germany', 'France', 'Indonesia']
rank_one    = [1, 5, 7]
capitals    = ['Berlin', 'Paris', 'Jakarta']
df1         = pd.DataFrame({'country': countries,
                            'rank_one': rank_one,
                            'capital': capitals})

df1 = df1[['country', 'capital', 'rank_one']]    

population = ['8M', '82M', '66M', '255M']
rank_two   = [0, 1, 6, 9]
df2        = pd.DataFrame({'population': population,
                           'rank_two': rank_two})

df2        = df2[['rank_two', 'population']]

I want to merge the two DataFrames based on either an exact match or an approximate match.

A row matches if rank_two is equal to rank_one

OR

rank_two is the closest number that is larger than rank_one

Example:

df1 :

     country  capital  rank_one
0    Germany   Berlin         1
1     France    Paris         5
2  Indonesia  Jakarta         7

df2 :

   rank_two population
0         0         8M
1         1        82M
2         6        66M
3         9       255M

df3_result :

     country  capital  rank_one  rank_two population
0    Germany   Berlin         1         1        82M
1     France    Paris         5         6        66M
2  Indonesia  Jakarta         7         9       255M
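
The matching rule above (take an equal rank_two, otherwise the next larger one) can be sketched in plain Python before involving pandas. This is only an illustration of the rule, not part of the original question; the helper name match_forward is hypothetical, and the sketch assumes rank_two is sorted in ascending order.

```python
from bisect import bisect_left

rank_one = [1, 5, 7]
rank_two = [0, 1, 6, 9]  # assumed sorted ascending, as bisect requires


def match_forward(value, candidates):
    """Hypothetical helper: return the first candidate >= value, or None."""
    i = bisect_left(candidates, value)  # index of first element >= value
    return candidates[i] if i < len(candidates) else None


matches = [match_forward(r, rank_two) for r in rank_one]
print(matches)  # [1, 6, 9]
```

A rank_one with no equal or larger rank_two (for example 10) has no match, which this sketch reports as None.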

2 answers:

Answer 0 (score: 6)

Use merge_asof:

pd.merge_asof(df1,df2,left_on='rank_one',right_on='rank_two',direction='forward')
Out[1206]: 
     country  capital  rank_one  rank_two population
0    Germany   Berlin         1         1        82M
1     France    Paris         5         6        66M
2  Indonesia  Jakarta         7         9       255M
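
For anyone who wants to run this answer end to end, here is a self-contained sketch that rebuilds the question's data. One assumption worth stating: pd.merge_asof expects both frames to be sorted by their key columns, so the sketch sorts them explicitly even though the sample data is already in order. The names left and right are just illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'country': ['Germany', 'France', 'Indonesia'],
                    'capital': ['Berlin', 'Paris', 'Jakarta'],
                    'rank_one': [1, 5, 7]})
df2 = pd.DataFrame({'rank_two': [0, 1, 6, 9],
                    'population': ['8M', '82M', '66M', '255M']})

# merge_asof requires both inputs to be sorted on their key columns.
left = df1.sort_values('rank_one')
right = df2.sort_values('rank_two')

# direction='forward' picks the first rank_two that is >= rank_one.
df3 = pd.merge_asof(left, right,
                    left_on='rank_one', right_on='rank_two',
                    direction='forward')
print(df3)
```

With the sample data, each rank_one of 1, 5 and 7 is paired with a rank_two of 1, 6 and 9 respectively.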

Answer 1 (score: 2)

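You can use the pandas merge_asof function with direction='forward', which pairs each rank_one with an equal or the next larger rank_two.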

Alternatively, if you want to merge on the nearest value and do not mind whether it is higher or lower, you can use:

pd.merge_asof(df1, df2, left_on="rank_one", right_on="rank_two", direction='nearest')
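
To see how the direction argument changes the result, the following sketch runs merge_asof on the question's data with each of the three values it accepts ('backward', 'forward' and 'nearest'). It is meant as a quick comparison rather than part of the original answer; the results dictionary is only an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({'country': ['Germany', 'France', 'Indonesia'],
                    'rank_one': [1, 5, 7]})
df2 = pd.DataFrame({'rank_two': [0, 1, 6, 9],
                    'population': ['8M', '82M', '66M', '255M']})

results = {}
for direction in ('backward', 'forward', 'nearest'):
    merged = pd.merge_asof(df1, df2,
                           left_on='rank_one', right_on='rank_two',
                           direction=direction)
    # Record which rank_two was matched to each rank_one.
    results[direction] = merged['rank_two'].tolist()
    print(direction, results[direction])

# backward [1, 1, 6]
# forward [1, 6, 9]
# nearest [1, 6, 6]
```

Only 'forward' reproduces the result the question asks for. With 'nearest', Indonesia (rank_one 7) is paired with rank_two 6 instead of 9, because 6 is closer.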