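Given a DataFrame created by concatenating other DataFrames that have exactly the same columns and rows, how do I get a particular column across all keys?
Here is a concrete example: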
In [9]: df = pd.DataFrame(np.random.randn(nrow, ncol), columns=list(string.uppercase[:ncol]))
In [10]: df
Out[10]:
A B C D E
0 -2.445083 0.020886 -0.518002 -1.087649 -2.457616
1 -0.834116 -0.000645 -0.052698 1.017388 0.977475
2 -0.043448 0.348393 -0.846228 -1.144556 1.472701
3 0.359526 -1.723547 -1.659162 0.173996 0.315652
4 1.100312 -0.681820 -1.065581 0.153885 0.398029
5 -2.992605 0.322006 0.097947 -0.514609 -0.871674
6 1.981342 0.147712 0.497502 0.547683 1.070719
7 0.281246 -0.198311 0.564416 -0.762356 0.763791
8 -0.913407 0.927109 0.348485 3.364223 2.602642
9 -0.644116 2.095727 1.125958 0.296914 -0.420522
In [11]: pieces = []
In [12]: for i in range(4):
....: pieces.append(pd.DataFrame(np.random.randn(nrow,ncol), columns=list(string.uppercase[:ncol])))
....:
In [13]: df_concat = pd.concat(pieces, keys=['W','X','Y','Z'], axis=1)
In [14]: df_concat
Out[14]:
W X \
A B C D E A B
0 -0.505484 -0.457853 -0.990727 -0.780617 1.215694 0.450981 -1.633229
1 0.116248 0.235593 -0.339177 0.358038 0.583175 1.699095 -0.238950
2 -0.000709 -2.145297 1.041371 -0.046306 0.308357 1.098283 0.020833
3 0.301729 -0.385389 -0.247188 -1.212048 1.344364 0.271609 -0.570161
4 -0.965596 0.030255 0.677786 -0.272460 0.074819 -1.129305 -1.367137
5 0.712317 -0.888795 -1.096789 -0.606129 -1.048819 -2.629423 1.298547
6 -0.743539 0.040812 -0.802773 0.743799 0.430384 -0.902586 0.082162
7 -0.587438 -1.298439 -1.130855 -1.860293 1.802137 -0.071374 2.002444
8 0.060809 -0.279892 0.316728 0.413448 -0.564599 -0.127618 0.628813
9 1.142441 1.224539 0.572980 0.037514 -0.513964 -1.026794 0.899758
Y \
C D E A B C D
0 -0.953875 -0.656037 -1.083118 -0.706460 -0.542555 0.028699 1.100427
1 -0.812239 -0.758029 0.365095 0.132736 1.161346 -1.372225 -1.780733
2 -1.347575 1.524654 0.031564 0.651127 -0.751353 0.770411 0.317422
3 -1.269158 0.590106 0.007470 -1.068919 -0.748173 -0.495151 0.304920
4 0.488790 -0.067784 -1.154394 -1.795902 0.315138 -0.243877 0.698870
5 0.296125 -0.010721 0.984436 -1.692544 0.703791 0.898088 2.379869
6 1.580341 -0.984228 -1.141533 -0.950717 -1.158840 0.149764 1.136630
7 1.216956 -0.429757 0.376067 0.417440 0.331015 -0.837385 -0.984118
8 -1.508074 -0.483468 0.297295 0.253952 -0.356498 -0.193768 -0.954337
9 -0.951482 -0.020037 -1.888375 -1.052739 -0.996700 -0.758079 -0.239132
Z
E A B C D E
0 -0.736567 1.451512 -0.877736 -0.826044 0.850919 0.005778
1 -0.327570 -1.706155 1.359768 0.808397 1.697910 0.109116
2 0.932116 0.361915 -0.460502 -0.344834 -1.792748 0.722837
3 0.567515 -0.440755 0.850031 -0.091985 -0.296515 -0.078628
4 0.210144 0.617150 1.017416 0.552831 -1.757228 0.983008
5 -0.134114 -1.137423 0.256443 -1.015701 0.972131 -1.686675
6 0.376023 -0.195116 2.127337 -0.687416 0.425428 2.378165
7 -0.082692 1.686996 -1.857700 0.638241 0.551779 -0.486632
8 2.148983 0.188987 -0.387614 0.833069 1.240079 -0.031077
9 -1.278626 -1.219897 -0.173212 -0.119734 -0.244129 1.940811
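How can I get column 'A' for all keys? I tried the same thing with a Panel, but there too I have to select a key first. What if I want the column for all of the keys?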
In [18]: p = pd.Panel.from_dict(dict(zip(['W','X','Y','Z'], [pd.DataFrame(np.random.randn(nrow, ncol), columns=list(string.uppercase[:ncol])) for i in range(4)])))
In [19]: p
Out[19]:
<class 'pandas.core.panel.Panel'>
Dimensions: 4 (items) x 10 (major_axis) x 5 (minor_axis)
Items axis: W to Z
Major_axis axis: 0 to 9
Minor_axis axis: A to E
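The final output I want is a 10x4 DataFrame: all rows by the 'A' column of every sample. What I have been doing so far is manually extracting one column from each DataFrame and then concatenating them to form the 10x4 DataFrame, e.g.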
In [35]: a_pieces = [df_concat[x].ix[:,'A'] for x in ['W','X','Y','Z']]
In [36]: a_concat = pd.concat(a_pieces, keys=['W','X','Y','Z'], axis=1)
In [37]: a_concat
Out[37]:
W X Y Z
0 -0.505484 0.450981 -0.706460 1.451512
1 0.116248 1.699095 0.132736 -1.706155
2 -0.000709 1.098283 0.651127 0.361915
3 0.301729 0.271609 -1.068919 -0.440755
4 -0.965596 -1.129305 -1.795902 0.617150
5 0.712317 -2.629423 -1.692544 -1.137423
6 -0.743539 -0.902586 -0.950717 -0.195116
7 -0.587438 -0.071374 0.417440 1.686996
8 0.060809 -0.127618 0.253952 0.188987
9 1.142441 -1.026794 -1.052739 -1.219897
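For anyone reproducing this on a recent pandas, where `.ix` has been removed, the manual approach above can be sketched roughly as follows. The seed, the `nrow`/`ncol` values, and the use of `string.ascii_uppercase` (instead of the Python 2 only `string.uppercase`) are assumptions made here so that the snippet runs on its own; they are not from the original session.

```python
import string

import numpy as np
import pandas as pd

nrow, ncol = 10, 5  # assumed sizes, matching the 10x5 frames shown above
keys = ['W', 'X', 'Y', 'Z']
cols = list(string.ascii_uppercase[:ncol])

np.random.seed(0)  # assumed seed, only so the example is reproducible
pieces = [pd.DataFrame(np.random.randn(nrow, ncol), columns=cols) for _ in keys]
df_concat = pd.concat(pieces, keys=keys, axis=1)

# Selecting a top-level key returns that sub-frame; .loc then picks column 'A'.
a_pieces = [df_concat[k].loc[:, 'A'] for k in keys]

# Concatenating the four Series side by side gives one column per key.
a_concat = pd.concat(a_pieces, keys=keys, axis=1)
print(a_concat.shape)
```

This still builds the result one key at a time, which is exactly the manual step the question is trying to avoid.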
Answer 0 (score: 2)
You should be able to take the slice out with xs:
df_concat.xs('A', level=1, axis=1)
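A minimal, self-contained sketch of the `xs` call is below. The setup (seed, `nrow`, `ncol`, `string.ascii_uppercase`) is assumed here to mirror the question and is not part of the answer itself.

```python
import string

import numpy as np
import pandas as pd

nrow, ncol = 10, 5  # assumed sizes, as in the question's example
keys = ['W', 'X', 'Y', 'Z']
cols = list(string.ascii_uppercase[:ncol])

np.random.seed(0)  # assumed seed, only for reproducibility
pieces = [pd.DataFrame(np.random.randn(nrow, ncol), columns=cols) for _ in keys]
df_concat = pd.concat(pieces, keys=keys, axis=1)

# Cross-section on the inner column level (level=1) along the columns (axis=1).
# The selected level is dropped, so the remaining columns are the outer keys.
a_all = df_concat.xs('A', level=1, axis=1)
print(a_all.columns.tolist())
```

Here `level=1` refers to the inner level of the column MultiIndex (the `A` to `E` labels), and `axis=1` tells `xs` to look at columns rather than rows.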
Answer 1 (score: 1)
Is this what you are looking for?
In [20]: df_concat.swaplevel(1,0,axis=1)['A']
Out[20]:
W X Y Z
0 -1.040162 0.220310 0.493406 0.224235
1 0.093167 1.554220 1.626530 0.068452
2 0.700489 0.563523 0.882834 0.263289
3 0.148377 0.012024 -0.871754 0.428075
4 -0.812572 -0.194886 1.234637 1.174096
5 -0.226345 -0.211326 0.688867 -0.992412
6 -1.348947 -1.319374 -0.693617 1.069359
7 -0.336275 1.191541 0.681850 0.259941
8 -1.029588 -1.260796 0.184852 -0.136066
9 0.115574 -0.075612 0.777306 -0.874591
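The `swaplevel` route can be sketched in the same way. Again, the setup values and the seed are assumptions added so the snippet is runnable on its own, so the numbers will not match any output shown in this thread.

```python
import string

import numpy as np
import pandas as pd

nrow, ncol = 10, 5  # assumed sizes, as in the question's example
keys = ['W', 'X', 'Y', 'Z']
cols = list(string.ascii_uppercase[:ncol])

np.random.seed(0)  # assumed seed, only for reproducibility
pieces = [pd.DataFrame(np.random.randn(nrow, ncol), columns=cols) for _ in keys]
df_concat = pd.concat(pieces, keys=keys, axis=1)

# Swap the two column levels so that 'A'..'E' becomes the outer level,
# then select 'A' as if it were an ordinary top-level column key.
swapped = df_concat.swaplevel(0, 1, axis=1)
a_all = swapped['A']
print(a_all.shape)
```

Both answers should give the same 10x4 frame; `xs` does it in one call, while `swaplevel` may be handy if several inner-level columns are needed from the reordered frame.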