Question

I am new to Python. I have the following code:

import wbdata # World Bank's API
import pandas
import matplotlib.pyplot as plt

#countries I want
countries = ["CL","UY","HU"]

#indicators I want
indicators = {'NY.GNP.PCAP.CD':'GNI per Capita'}

#grab indicators above for countries I want and load into data frame
df = wbdata.get_dataframe(indicators, country=countries, convert_date=False)

#list the columns in data frame
list(df.columns.values)

The output of my data frame, and the list of its columns, look like this:

In [1]:df
Out[1]: 
                GNI
country date         
Chile   2017  13610.0
        2016  13430.0
        2015  14270.0
        2014  15140.0
        2013  15360.0
        2012  14410.0
        2011  12380.0
              ...
Uruguay 2017  23410.0
        2016  11430.0
        2015  11270.0
        2014  11440.0
        2013  65360.0
        2012  94410.0
        2011  10380.0

[174 rows x 1 columns]

In [2]: list(df.columns.values)
Out[2]: ['GNI']

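As you can see, only one column ("GNI") in the data frame is recognized as a column.

How can I get "country" and "date" recognized as columns as well?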

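My goal is a panel dataset like the one below, with three variables (in Stata terms): Country, Date, and GNI. There should be no blanks in the Country variable, because each GNI observation corresponds to one country-date combination.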

Country Date   GNI   
Chile   2017  13610.0
Chile   2016  13430.0
Chile   2015  14270.0
Chile   2014  15140.0
Chile   2013  15360.0
Chile   2012  14410.0
Chile   2011  12380.0
              ...
Uruguay 2017  23410.0
Uruguay 2016  11430.0
Uruguay 2015  11270.0
Uruguay 2014  11440.0
Uruguay 2013  65360.0
Uruguay 2012  94410.0
Uruguay 2011  10380.0

[174 rows × 3 columns]

I am surely butchering Python's syntax and terminology, but any help or guidance would be much appreciated.

Answer 1

Because "country" and "date" are used as the index (a MultiIndex, to be precise), GNI is the only thing you get as a column.

What you need is reset_index:

df = df.reset_index(drop=False)
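
To see the effect without calling the World Bank API, here is a minimal sketch. It assumes a small hand-built frame as a stand-in for the wbdata result, using four of the values from the question; the name panel is only illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the wbdata result: a frame with a
# (country, date) MultiIndex and a single GNI column.
index = pd.MultiIndex.from_product(
    [["Chile", "Uruguay"], ["2017", "2016"]],
    names=["country", "date"],
)
df = pd.DataFrame({"GNI": [13610.0, 13430.0, 23410.0, 11430.0]}, index=index)

# country and date are index levels here, so they are not listed as columns
print(df.index.names)
print(list(df.columns))

# reset_index moves every index level into an ordinary column and
# repeats the country name on each row
panel = df.reset_index()
print(list(panel.columns))
print(panel)
```

Since drop=False is the default, df.reset_index() is equivalent to the call above. The new columns take their names from the index levels.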
