Accessing World Bank data indicators and using a pandas DataFrame

Time: 2018-10-19 11:02:36

Tags: python pandas

I am trying to access data for a range of health indicators from the World Bank.

To access the World Bank data, I use the following code:

Imports:

import wbdata
import datetime
import pandas as pd  # needed for pd.DataFrame further down

To view the available indicators:

wbdata.get_indicator(source=16) #Source 16 gives indicators for health.

This returns the following:

SP.DYN.TFRT.IN          Fertility rate, total (births per woman)
SP.DYN.SMAM.MA          Mean age at first marriage, male
SP.DYN.SMAM.FE          Mean age at first marriage, female

To access data for specific countries or regions over a period of time, I use the following code:

data_dates = (datetime.datetime(2015,1,1), datetime.datetime(2015,1,1))

top_20_data = wbdata.get_dataframe({'SP.DYN.TFRT.IN':'Fertility rate, total (births per woman)','SP.DYN.SMAM.MA':'Mean age at first marriage, male'}, 
                            country=('BE','BG','CZ','DK','DE','EE','IE','GR','ES','FR','HR','IT','CY','LV','LT','LU',
                                     'HU','MT','NL','AT','PL','PT','RO','SI','SK','FI','SE','GBR'), 
                            data_date=data_dates, 
                            convert_date=False, keep_levels=True)

What I want to do is supply each indicator and its description from a DataFrame.

What I tried was to create a small sample pandas DataFrame:

data = {'Indicator': ['SP.DYN.TFRT.IN', 'SP.DYN.SMAM.MA', 'SP.DYN.SMAM.FE'],
        'Description': ['Fertility rate, total (births per woman)', 'Mean age at first marriage, male', 'Mean age at first marriage, female']}

df = pd.DataFrame(data, columns=['Indicator', 'Description']) 

and pass it to wbdata.get_dataframe as follows:

top_20_data = wbdata.get_dataframe({df['Indicator']:df['Description']}, 
                            country=('BE','BG','CZ','DK','DE','EE','IE','GR','ES','FR','HR','IT','CY','LV','LT','LU',
                                     'HU','MT','NL','AT','PL','PT','RO','SI','SK','FI','SE','GBR'), 
                            data_date=data_dates, 
                            convert_date=False, keep_levels=True)

But I get the following error:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

I have searched online but have not found anything particularly useful.
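The error comes from using a whole DataFrame column as a dictionary key: dictionary keys must be hashable, and a pandas Series is not. The following is a minimal sketch of that behaviour, using a two-row stand-in for the DataFrame above. The variable names `series_key_failed`, `first_code`, and `ok` are illustrative only, and the exact wording of the TypeError message may differ between pandas versions.

```python
import pandas as pd

# Small stand-in for the DataFrame in the question.
df = pd.DataFrame({
    "Indicator": ["SP.DYN.TFRT.IN", "SP.DYN.SMAM.MA"],
    "Description": [
        "Fertility rate, total (births per woman)",
        "Mean age at first marriage, male",
    ],
})

# A whole column is a pandas Series, and a Series cannot be hashed,
# so it cannot serve as a dictionary key.
try:
    {df["Indicator"]: df["Description"]}
    series_key_failed = False
except TypeError:
    series_key_failed = True

# Individual values taken from the column are plain strings, which are hashable.
first_code = df["Indicator"].iloc[0]
ok = {first_code: df["Description"].iloc[0]}
```

So the dictionary has to be built from the individual values in the two columns, one key per indicator code, rather than from the columns themselves.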

1 answer:

Answer 0 (score: 2):

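Convert the DataFrame to a dictionary that maps each indicator code to its description, and pass that dictionary instead: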

# Each row of df.values is an (Indicator, Description) pair
d = dict(df.values)
# Alternative:
# d = df.set_index('Indicator')['Description'].to_dict()
top_20_data = wbdata.get_dataframe(d, 
                            country=('BE','BG','CZ','DK','DE','EE','IE','GR','ES','FR','HR','IT','CY','LV','LT','LU',
                                     'HU','MT','NL','AT','PL','PT','RO','SI','SK','FI','SE','GBR'), 
                            data_date=data_dates, 
                            convert_date=False, keep_levels=True)
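To see what these conversions produce without calling the World Bank API, here is a small sketch that builds the dictionary three ways and compares the results. The `dict(zip(...))` variant and the names `d_values`, `d_index`, and `d_zip` are not part of the answer above; they are added for illustration. The sample DataFrame assumes the third indicator code is SP.DYN.SMAM.FE, matching the indicator list in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Indicator": ["SP.DYN.TFRT.IN", "SP.DYN.SMAM.MA", "SP.DYN.SMAM.FE"],
    "Description": [
        "Fertility rate, total (births per woman)",
        "Mean age at first marriage, male",
        "Mean age at first marriage, female",
    ],
})

# Codes become dictionary keys, so duplicates would silently overwrite each other.
assert df["Indicator"].is_unique

# 1. Treat each row as a (key, value) pair; only works with exactly two columns.
d_values = dict(df.values)

# 2. Index by code, then convert the Description column to a dict.
d_index = df.set_index("Indicator")["Description"].to_dict()

# 3. Pair the two columns explicitly; still works if df has extra columns.
d_zip = dict(zip(df["Indicator"], df["Description"]))

same = d_values == d_index == d_zip
```

All three give the same mapping here. `dict(df.values)` is the shortest, but it depends on the DataFrame having exactly two columns in key, value order, so the other two forms may be safer if the DataFrame later gains more columns.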