Join 2 multi-index DataFrames with different columns using pandas

Time: 2019-02-23 15:26:20

Tags: pandas dataframe

I have 2 DataFrames:

import pandas as pd

# df1 is indexed by (Country, Location)
df1 = pd.DataFrame({'Country': ['US', 'IT', 'FR'],
                    'Location': ['Hawai', 'Torino', 'Paris'],
                    '2000': [20, 40, 60],
                    '2002': [100, 200, 300]})
df1.set_index(['Country', 'Location'], inplace=True)

# df2 is indexed by Country only
df2 = pd.DataFrame({'Country': ['US', 'IT', 'FR', 'GB'],
                    '2002': [2, 4, 3, 6],
                    '2018': [6, 88, 7, 90]})
df2.set_index(['Country'], inplace=True)

I want to compute the ratio of df1 to df2 for the year columns that the 2 frames have in common:

                  2000  2002
Country Location            
US      Hawai       20   100
IT      Torino      40   200
FR      Paris       60   300
         2002  2018
Country            
US          2     6
IT          4    88
FR          3     7
GB          6    90

The ratio should produce:

                  2002
Country Location
US      Hawai       50
IT      Torino      50
FR      Paris      100

I tried several kinds of joins but could not get it to work. Any ideas?

1 Answer:

Answer 0: (score: 1)

Use DataFrame.div with level=0, so that df2's Country index is matched against the first level of df1's MultiIndex:

df = df1.div(df2, level=0)
print (df)
                  2000   2002  2018
Country Location                   
US      Hawai      NaN   50.0   NaN
IT      Torino     NaN   50.0   NaN
FR      Paris      NaN  100.0   NaN

If you need to drop the all-NaN columns (the columns that are not present in both DataFrames):

df = df1.div(df2, level=0).dropna(axis=1, how='all')
print (df)
                   2002
Country Location       
US      Hawai      50.0
IT      Torino     50.0
FR      Paris     100.0
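
Because level=0 matches on Country, the same call should also broadcast when one country has several locations. A minimal sketch of that, using hypothetical data (the 'Alaska' row is not part of the question and is added only for illustration):

```python
import pandas as pd

# Hypothetical data: the 'Alaska' row is invented for this illustration.
df1 = pd.DataFrame(
    {
        "Country": ["US", "US", "IT"],
        "Location": ["Hawai", "Alaska", "Torino"],
        "2002": [100, 10, 200],
    }
).set_index(["Country", "Location"])

df2 = pd.DataFrame({"Country": ["US", "IT"], "2002": [2, 4]}).set_index("Country")

# level=0 aligns df2's Country index with the first level of df1's MultiIndex,
# so both US rows are divided by the single US row of df2.
ratio = df1.div(df2, level=0)
print(ratio)
```

Each (Country, Location) row of df1 is divided by the df2 row of its country, so no explicit join is needed.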

Another solution is to first get the intersection of the columns of the two DataFrames, then filter both frames before dividing:

c = df1.columns.intersection(df2.columns)
print (c)
Index(['2002'], dtype='object')

df = df1[c].div(df2[c], level=0)
print (df)
                   2002
Country Location       
US      Hawai      50.0
IT      Torino     50.0
FR      Paris     100.0
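
To see roughly what the level-based alignment is doing, the division can also be written out by hand. This is only a sketch, not a replacement for div(..., level=0); it assumes every Country in df1 also exists in df2, and the names common, countries and denominator are made up for the example:

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "Country": ["US", "IT", "FR"],
        "Location": ["Hawai", "Torino", "Paris"],
        "2000": [20, 40, 60],
        "2002": [100, 200, 300],
    }
).set_index(["Country", "Location"])

df2 = pd.DataFrame(
    {"Country": ["US", "IT", "FR", "GB"], "2002": [2, 4, 3, 6], "2018": [6, 88, 7, 90]}
).set_index("Country")

# Columns present in both frames (here only '2002').
common = df1.columns.intersection(df2.columns)

# Country label of every df1 row, in df1's row order.
countries = df1.index.get_level_values("Country")

# Pick one df2 row per df1 row; assumes each country is present in df2.
denominator = df2.loc[countries, common].to_numpy()

# Plain element-wise division; the rows already line up by position.
ratio = df1[common] / denominator
print(ratio)
```

The manual version raises a KeyError if a country is missing from df2, whereas the level-based div aligns the labels for you, which is one reason to prefer the latter.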