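I'm struggling with this problem. I know how to produce a pivot table, but I'm having real trouble keeping the index at two levels. Here is the problem, with my code below:
Use pivot_table to tidy the data in table1 below, and assign the result to the variable table1_tidy. In this case, keep the index as two levels, country and year.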
import pandas as pd
table1columns = ["country", "year", "type", "count"]
table1data =[ ["Afghanistan", 1999, "cases", 745],
["Afghanistan", 1999, "population", 19987071],
["Afghanistan", 2000, "cases", 2666],
["Afghanistan", 2000, "population", 20595360],
[ "Brazil", 1999, "cases", 37737],
[ "Brazil", 1999, "population", 172006362],
[ "Brazil", 2000, "cases", 80488],
[ "Brazil", 2000, "population", 174504898],
[ "China", 1999, "cases", 212258],
[ "China", 1999, "population",1272915272],
[ "China", 2000, "cases", 213766],
[ "China", 2000, "population",1280428583] ]
table1 = pd.DataFrame(table1data, columns=table1columns)
### BEGIN SOLUTION
'''
This code uses `pivot_table` to tidy the data below in `table1`,
assigning the result to the variable `table1_tidy`.
'''
table1_tidy = table1.pivot('type', 'count')
### END SOLUTION
# When done, comment out line below
# raise NotImplementedError()
print(table1_tidy)
My code needs to pass the following assert statements, but it currently fails:
assert table1_tidy.shape == (6, 2)
assert table1_tidy.iloc[3, 0] == 80488
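Answer 0 (score: 1)
pivot raises a ValueError when asked to build a multi-level index. There is an open bug report about this on GitHub. The current workaround is to use pivot_table instead: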
table1_tidy = table1.pivot_table(index=['country', 'year'], columns='type', values='count')
type               cases  population
country     year
Afghanistan 1999     745    19987071
            2000    2666    20595360
Brazil      1999   37737   172006362
            2000   80488   174504898
China       1999  212258  1272915272
            2000  213766  1280428583
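As a quick check of the pivot_table call above, here is a minimal, self-contained sketch that rebuilds table1 from the question and runs the two asserts the asker needs to pass. It assumes pandas is installed; the two print calls are only for illustration. Note that pivot_table aggregates with the mean by default, which changes nothing here because each (country, year, type) combination occurs exactly once.

```python
import pandas as pd

# Data copied from the question.
table1 = pd.DataFrame(
    [
        ["Afghanistan", 1999, "cases", 745],
        ["Afghanistan", 1999, "population", 19987071],
        ["Afghanistan", 2000, "cases", 2666],
        ["Afghanistan", 2000, "population", 20595360],
        ["Brazil", 1999, "cases", 37737],
        ["Brazil", 1999, "population", 172006362],
        ["Brazil", 2000, "cases", 80488],
        ["Brazil", 2000, "population", 174504898],
        ["China", 1999, "cases", 212258],
        ["China", 1999, "population", 1272915272],
        ["China", 2000, "cases", 213766],
        ["China", 2000, "population", 1280428583],
    ],
    columns=["country", "year", "type", "count"],
)

# Passing a list to `index` is what keeps both country and year as index levels.
table1_tidy = table1.pivot_table(
    index=["country", "year"], columns="type", values="count"
)

# The asserts from the question.
assert table1_tidy.shape == (6, 2)
assert table1_tidy.iloc[3, 0] == 80488

print(table1_tidy.index.names)    # names of the index levels
print(table1_tidy.index.nlevels)  # number of index levels
```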
You can get the same result using set_index followed by unstack:
table1_tidy = table1.set_index(['country', 'year', 'type'])['count'].unstack()
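To illustrate that the two approaches agree, the sketch below compares them on a small subset of the question's data. The names sample, by_pivot, by_unstack, and with_duplicate are made up for this example. It also shows one practical difference: when a (country, year, type) combination is duplicated, unstack raises a ValueError, whereas pivot_table silently aggregates the duplicates.

```python
import pandas as pd

# A small subset of the question's data, used only for illustration.
sample = pd.DataFrame(
    [
        ["Afghanistan", 1999, "cases", 745],
        ["Afghanistan", 1999, "population", 19987071],
        ["Brazil", 1999, "cases", 37737],
        ["Brazil", 1999, "population", 172006362],
    ],
    columns=["country", "year", "type", "count"],
)

by_pivot = sample.pivot_table(
    index=["country", "year"], columns="type", values="count"
)
by_unstack = sample.set_index(["country", "year", "type"])["count"].unstack()

# Compare values and index rather than dtypes, which may differ between the two.
same_values = bool((by_pivot.to_numpy() == by_unstack.to_numpy()).all())
same_index = by_pivot.index.equals(by_unstack.index)
print(same_values, same_index)

# Duplicate one row: unstack cannot reshape, while pivot_table averages.
with_duplicate = pd.concat([sample, sample.iloc[[0]]], ignore_index=True)
try:
    with_duplicate.set_index(["country", "year", "type"])["count"].unstack()
    unstack_failed = False
except ValueError:
    unstack_failed = True
print(unstack_failed)
```

If the data is known to contain no duplicate keys, either form works; pivot_table is the more forgiving choice when that is not guaranteed.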