Tidying data correctly with a pivot table: keeping the index at two levels

Time: 2019-02-08 19:00:48

Tags: python pandas pivot

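I'm struggling with this problem. I know how to build a pivot table, but I'm having real trouble keeping the index at two levels. Here is the problem statement, with my code below: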

Use pivot_table to tidy the data in table1 below, and assign the result to the variable table1_tidy. In this case, keep the index as two levels, country and year.

import pandas as pd

table1columns = ["country",  "year",       "type",     "count"]
table1data =[ ["Afghanistan",  1999,      "cases",       745],
          ["Afghanistan",  1999, "population",  19987071],
          ["Afghanistan",  2000,      "cases",      2666],
          ["Afghanistan",  2000, "population",  20595360],
          [     "Brazil",  1999,      "cases",     37737],
          [     "Brazil",  1999, "population", 172006362],
          [     "Brazil",  2000,      "cases",     80488],
          [     "Brazil",  2000, "population", 174504898],
          [      "China",  1999,      "cases",    212258],
          [      "China",  1999, "population",1272915272],
          [      "China",  2000,      "cases",    213766],
          [      "China",  2000, "population",1280428583] ]

table1 = pd.DataFrame(table1data, columns=table1columns)

### BEGIN SOLUTION
'''
This code uses `pivot_table` to tidy the data below in `table1`, 
assigning the result to the variable `table1_tidy`.
'''
table1_tidy = table1.pivot('type', 'count')
### END SOLUTION
# When done, comment out line below
# raise NotImplementedError()
print(table1_tidy)

My code needs to pass the following assert statements, but it currently fails:

assert table1_tidy.shape == (6, 2)
assert table1_tidy.iloc[3, 0] == 80488

1 answer:

Answer 0 (score: 1)

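pivot raises a ValueError when given a multi-level index. There is an open bug on GitHub for this. The current workaround is to use pivot_table instead: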

table1_tidy = table1.pivot_table(index=['country', 'year'], columns='type', values='count')



type                cases   population
country     year        
Afghanistan 1999    745     19987071
            2000    2666    20595360
Brazil      1999    37737   172006362
            2000    80488   174504898
China       1999    212258  1272915272
            2000    213766  1280428583
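
To check this result against the assertions in the question, here is a minimal self-contained sketch that rebuilds table1 and applies the same pivot_table call. It assumes pandas is imported as pd. No aggfunc is passed, so the default (mean) applies; that should be harmless here because each country/year/type combination appears only once.

```python
import pandas as pd

table1columns = ["country", "year", "type", "count"]
table1data = [
    ["Afghanistan", 1999, "cases", 745],
    ["Afghanistan", 1999, "population", 19987071],
    ["Afghanistan", 2000, "cases", 2666],
    ["Afghanistan", 2000, "population", 20595360],
    ["Brazil", 1999, "cases", 37737],
    ["Brazil", 1999, "population", 172006362],
    ["Brazil", 2000, "cases", 80488],
    ["Brazil", 2000, "population", 174504898],
    ["China", 1999, "cases", 212258],
    ["China", 1999, "population", 1272915272],
    ["China", 2000, "cases", 213766],
    ["China", 2000, "population", 1280428583],
]
table1 = pd.DataFrame(table1data, columns=table1columns)

# Rows are keyed by (country, year); each distinct "type" becomes a column.
table1_tidy = table1.pivot_table(
    index=["country", "year"], columns="type", values="count"
)

# The index keeps two levels, as the exercise requires.
assert table1_tidy.index.nlevels == 2
assert list(table1_tidy.index.names) == ["country", "year"]

# The assertions from the question.
assert table1_tidy.shape == (6, 2)
assert table1_tidy.iloc[3, 0] == 80488

print(table1_tidy)
```

Row 3, column 0 is the "cases" value for Brazil in 2000, since both the rows and the columns come out sorted.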

You can get the same result with set_index followed by unstack:

table1_tidy = table1.set_index(['country', 'year', 'type'])['count'].unstack()
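
As a quick sanity check that the two approaches agree, the sketch below runs both on a subset of the question's data (the Brazil rows only, to keep it short) and compares the results. The variable names via_pivot_table and via_unstack are illustrative and not part of the original answer.

```python
import pandas as pd

# Subset of the question's table1 (Brazil rows only), used for illustration.
table1 = pd.DataFrame(
    [
        ["Brazil", 1999, "cases", 37737],
        ["Brazil", 1999, "population", 172006362],
        ["Brazil", 2000, "cases", 80488],
        ["Brazil", 2000, "population", 174504898],
    ],
    columns=["country", "year", "type", "count"],
)

# Approach 1: pivot_table with a two-level index.
via_pivot_table = table1.pivot_table(
    index=["country", "year"], columns="type", values="count"
)

# Approach 2: move all three keys into the index, then unstack the
# innermost level ("type") into columns.
via_unstack = table1.set_index(["country", "year", "type"])["count"].unstack()

same_values = bool((via_pivot_table.to_numpy() == via_unstack.to_numpy()).all())
same_index = list(via_pivot_table.index) == list(via_unstack.index)
same_columns = list(via_pivot_table.columns) == list(via_unstack.columns)
print(same_values, same_index, same_columns)
```

One difference worth keeping in mind: if a country/year/type combination appeared more than once, pivot_table would aggregate the duplicates (mean by default), whereas unstack would raise a ValueError about duplicate entries.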