How to merge 2 Pandas DataFrames (which have been pivoted) with a hierarchical index on the columns

Asked: 2018-04-14 03:10:42

Tags: python pandas dataframe merge pivot-table

I got 2 dataframes from the WHO API using JSON.

This is the code I used to get the data:

import requests
import pandas as pd

# Measles - Number of deaths of children < 5 by country & year
url = "http://apps.who.int/gho/athena/data/GHO/MORT_100.json?profile=simple&filter=COUNTRY:*;CHILDCAUSE:CH6"
response_json2 = requests.get(url).json()
dfWHO2 = pd.json_normalize(response_json2['fact'])
# Keep only the under-5 age group before pivoting
dfWHO2 = dfWHO2.loc[dfWHO2['dim.AGEGROUP'] == '0-4 years']
WHOMeaslesChildhoodDeaths = dfWHO2.pivot(index='dim.COUNTRY', columns='dim.YEAR', values='Value').astype(float)

# Measles First Dose Vaccination rate
url = "http://apps.who.int/gho/athena/data/GHO/WHS8_110.json?profile=simple&filter=COUNTRY:*"
response_json = requests.get(url).json()
dfWHO1 = pd.json_normalize(response_json['fact']).pivot(index='dim.COUNTRY', columns='dim.YEAR', values='Value').astype(float)
WHOMeasles1stVaccRate = dfWHO1
# Restrict the vaccination rates to the years 2000-2016
WHOMeasles1stVaccRate = WHOMeasles1stVaccRate.loc[:, '2000':'2016']

This gives me 2 dataframes: Measles - Number of deaths of children & Measles First Dose Vaccination rate

What I want is to end up with a dataframe that looks like this: Combined Rate and Deaths

I'm not quite sure how to approach it.

I did an unstack:

temp1 = WHOMeasles1stVaccRate.unstack() 
temp2 = WHOMeaslesChildhoodDeaths.unstack()
temp1 + temp2

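This gives me 2 Series, which I then combined, but of course nothing separates the "rate" from the "deaths" (the `+` adds the two values element-wise), so the output looks like this: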

dim.YEAR  dim.COUNTRY                                         
2000      Afghanistan                                             10607
          Albania                                                    97
          Algeria                                                  1630
          Andorra                                                    97
          Angola                                                   1572
          Antigua and Barbuda                                        95
          Argentina                                                  91
          Armenia                                                    95
          Australia                                                  91
          Austria                                                    75
          Azerbaijan                                                 90
          Bahamas                                                    93
          Bahrain                                                    98
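To make the blending visible, here is a minimal sketch using two countries whose 2000 values are copied from the combined table shown later in this thread. The variable names `rate`, `deaths` and `total` are illustrative only and are not part of the original code.

```python
import pandas as pd

# Two-level index shaped like the result of .unstack() on the pivoted frames.
idx = pd.MultiIndex.from_tuples(
    [("2000", "Afghanistan"), ("2000", "Albania")],
    names=["dim.YEAR", "dim.COUNTRY"],
)

# Values taken from the combined table in this thread (year 2000).
rate = pd.Series([27.0, 95.0], index=idx)
deaths = pd.Series([10580.0, 2.0], index=idx)

# `+` aligns the two Series on their index and adds them element-wise,
# so the rate and the death count collapse into a single number.
total = rate + deaths
print(total)
```

The sums (10607 for Afghanistan, 97 for Albania) match the first rows of the output above, which suggests that output is rate plus deaths rather than the two measures side by side.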

I realize I'm missing how to solve this - any help would be greatly appreciated.

2 answers:

Answer 0: (score: 3)

You need concat + swaplevel

Newdf = pd.concat([WHOMeasles1stVaccRate, WHOMeaslesChildhoodDeaths], axis=1, keys=['Rate', 'Deaths']).swaplevel(1, 0, axis=1).sort_index(axis=1)
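As a self-contained illustration of the same idea, the sketch below replaces the WHO frames with two tiny stand-ins (`rate` and `deaths` are hypothetical names; the values are copied from the sample table in this thread). It is a sketch under those assumptions, not the exact frames from the question.

```python
import pandas as pd

countries = pd.Index(["Afghanistan", "Albania"], name="dim.COUNTRY")

# Stand-ins for the two pivoted frames: countries as rows, years as columns.
rate = pd.DataFrame({"2000": [27.0, 95.0], "2001": [37.0, 95.0]}, index=countries)
rate.columns.name = "dim.YEAR"
deaths = pd.DataFrame({"2000": [10580.0, 2.0], "2001": [14120.0, 0.0]}, index=countries)
deaths.columns.name = "dim.YEAR"

combined = (
    # keys= adds an outer column level labelling each source frame
    pd.concat([rate, deaths], axis=1, keys=["Rate", "Deaths"])
    # move the year to the outer level, the measure to the inner level
    .swaplevel(0, 1, axis=1)
    # sort so that each year's Deaths/Rate columns sit next to each other
    .sort_index(axis=1)
)
print(combined)
```

Without the final `sort_index(axis=1)` the columns would keep the concat order (all rates, then all deaths), only with the levels swapped.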

Answer 1: (score: 1)

Python 3 unpacking inside the constructor.
This is just for fun! I would use @Wen's answer.

ndf = pd.DataFrame({
    **{(k, 'Rate'): v for k, v in WHOMeasles1stVaccRate.items()},
    **{(k, 'Deaths'): v for k, v in WHOMeaslesChildhoodDeaths.items()},
}).sort_index(axis=1)  # group each year's Deaths/Rate columns together

ndf.iloc[:10, :10]

                        2000           2001          2002          2003         2004      
                      Deaths  Rate   Deaths  Rate  Deaths  Rate  Deaths  Rate Deaths  Rate
dim.COUNTRY                                                                               
Afghanistan          10580.0  27.0  14120.0  37.0  6891.0  35.0   225.0  39.0  367.0  48.0
Albania                  2.0  95.0      0.0  95.0     0.0  96.0     0.0  93.0    0.0  96.0
Algeria               1550.0  80.0   1616.0  83.0  1646.0  81.0  2567.0  84.0   38.0  81.0
Andorra                  0.0  97.0      0.0  97.0     0.0  98.0     0.0  96.0    0.0  98.0
Angola                1536.0  36.0   4643.0  65.0  6061.0  66.0  2238.0  52.0   36.0  52.0
Antigua and Barbuda      0.0  95.0      0.0  97.0     0.0  99.0     0.0  99.0    0.0  97.0
Argentina                0.0  91.0      0.0  89.0     0.0  95.0     0.0  97.0    0.0  99.0
Armenia                  3.0  92.0      3.0  93.0     3.0  91.0     3.0  94.0    3.0  92.0
Australia                0.0  91.0      0.0  92.0     0.0  94.0     0.0  94.0    0.0  94.0
Austria                  0.0  75.0      0.0  79.0     0.0  78.0     0.0  79.0    0.0  74.0
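For readers who want to try the dictionary-unpacking approach without calling the WHO API, here is a small sketch with stand-in frames. The names `rate` and `deaths` and the loop variables are illustrative assumptions, and the values are copied from the table above.

```python
import pandas as pd

countries = ["Afghanistan", "Albania"]

# Stand-ins for the pivoted frames, one year column each.
rate = pd.DataFrame({"2000": [27.0, 95.0]}, index=countries)
deaths = pd.DataFrame({"2000": [10580.0, 2.0]}, index=countries)

# DataFrame.items() yields (column label, column Series) pairs.
# Tuple keys in the dict become a two-level MultiIndex on the columns.
ndf = pd.DataFrame({
    **{(year, "Rate"): col for year, col in rate.items()},
    **{(year, "Deaths"): col for year, col in deaths.items()},
}).sort_index(axis=1)

print(ndf)
```

The dict keeps insertion order, so the explicit `sort_index(axis=1)` is what places Deaths and Rate side by side under each year.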