假设我有一个MultiIndex数据框,如:
In [1]: arrays = [['one','one','one','two','two','two'],[1,2,3,1,2,3]]
In [2]: df = pa.DataFrame(randn(6,1),index=pa.MultiIndex.from_tuples(zip(*arrays)),columns=['A'])
In [3]: df
Out[3]:
A
one 1 0.229037
2 -1.640695
3 0.908127
two 1 -0.918750
2 1.170112
3 -2.620850
我想将此更改为新的数据框,列是MultiIndex数据帧的第一级索引?有一个简单的方法吗? (下面一个例子)
In [12]: dft = df.ix['one']
In [13]: dft = dft.rename(columns={'A':'one'})
In [14]: dft['two'] = df.ix['two']['A']
In [15]: dft
Out[15]:
one two
1 0.229037 -0.918750
2 -1.640695 1.170112
3 0.908127 -2.620850
答案 0 :(得分:8)
也许您正在寻找pandas.unstack:
In [56]: df
Out[56]:
A
one 1 0.229037
2 -1.640695
3 0.908127
two 1 -0.918750
2 1.170112
3 -2.620850
In [57]: df.unstack(level=0)
Out[57]:
A
one two
1 0.229037 -0.918750
2 -1.640695 1.170112
3 0.908127 -2.620850
答案 1 :(得分:2)
只是为此添加内容,还有另一种选择,即使用reset_index()
函数将多索引创建到列中。这里的区别在于它只是将值“弹出”为新列。取决于您的用例:
In [5]: df
Out[5]:
A
one 1 -1.598591
2 -0.354813
3 -0.435924
two 1 1.408328
2 0.448303
3 0.381360
In [6]: df.reset_index()
Out[6]:
level_0 level_1 A
0 one 1 -1.598591
1 one 2 -0.354813
2 one 3 -0.435924
3 two 1 1.408328
4 two 2 0.448303
5 two 3 0.381360