How to create a Pandas DataFrame with multiple columns that share the same name/identifier

Time: 2019-02-10 22:02:50

Tags: python pandas

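Is it possible to create a Pandas DataFrame (or something similar, if that works better) with multi-dimensional columns? What I would like to do is the following: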

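from numpy import hstack
from numpy.random import rand
from pandas import DataFrame

a = rand(10,1)
b = rand(10,1)
c = rand(10,1)
z = rand(10,3)
X = hstack((a,b,c,z))  # shape (10, 6)
dfX = DataFrame(X, columns=['a','b','c','z'])  # fails: 6 data columns, 4 names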

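Because of the shape of z, the above obviously does not work. My code needs X (or dfX.values) in matrix form, but I would like to be able to address the data as dfX['a'] instead of X[:,0], since the former reads better. What I am looking for is something that would make dfX['z'].shape equal to (10,3).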

Is this even possible? Is there a better solution than a Pandas DataFrame?

1 Answer:

Answer 0 (score: 2)

If I understand correctly, you are asking about multi-level columns (a MultiIndex on the columns):

In [116]: cols = pd.MultiIndex.from_tuples([('a',''),('b',''),('c',''),('z','1'),('z','2'),('z','3')])

In [117]: df = pd.DataFrame(X, columns=cols)

In [118]: df
Out[118]:
          a         b         c         z
                                        1         2         3
0  0.537156  0.660093  0.327486  0.669400  0.677605  0.174052
1  0.787965  0.983033  0.615065  0.720758  0.853734  0.724249
2  0.587206  0.841086  0.781004  0.676756  0.177496  0.845777
3  0.174780  0.620644  0.338403  0.336302  0.508739  0.210462
4  0.132288  0.765768  0.009254  0.155105  0.548964  0.618722
5  0.980484  0.844023  0.779290  0.462613  0.562098  0.571654
6  0.082263  0.511944  0.003198  0.359354  0.531740  0.870077
7  0.805626  0.745733  0.251047  0.737418  0.532125  0.208116
8  0.906066  0.827050  0.434911  0.869463  0.089989  0.074839
9  0.146566  0.960262  0.957117  0.269052  0.086013  0.558531

Results:

In [119]: df.loc[:, 'a']
Out[119]:
0    0.537156
1    0.787965
2    0.587206
3    0.174780
4    0.132288
5    0.980484
6    0.082263
7    0.805626
8    0.906066
9    0.146566
Name: a, dtype: float64

In [120]: df.loc[:, 'z']
Out[120]:
          1         2         3
0  0.669400  0.677605  0.174052
1  0.720758  0.853734  0.724249
2  0.676756  0.177496  0.845777
3  0.336302  0.508739  0.210462
4  0.155105  0.548964  0.618722
5  0.462613  0.562098  0.571654
6  0.359354  0.531740  0.870077
7  0.737418  0.532125  0.208116
8  0.869463  0.089989  0.074839
9  0.269052  0.086013  0.558531

In [121]: df.loc[:, ('z','2')]
Out[121]:
0    0.677605
1    0.853734
2    0.177496
3    0.508739
4    0.548964
5    0.562098
6    0.531740
7    0.532125
8    0.089989
9    0.086013
Name: (z, 2), dtype: float64
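
For reference, the session above can be put together as one self-contained script. This is only a sketch: the seeded generator `rng` and the name `z_block` are not part of the original post and are introduced here for illustration. It checks the two things the question asked for, namely that selecting 'z' gives a (10, 3) block and that the underlying values are still a plain matrix.

```python
import numpy as np
import pandas as pd

# Seeded generator (an assumption, used only to make the example reproducible)
rng = np.random.default_rng(0)
a = rng.random((10, 1))
b = rng.random((10, 1))
c = rng.random((10, 1))
z = rng.random((10, 3))
X = np.hstack((a, b, c, z))  # shape (10, 6)

# Two-level column index: 'a', 'b', 'c' get an empty second level,
# while the three columns of z share the top-level label 'z'
cols = pd.MultiIndex.from_tuples(
    [('a', ''), ('b', ''), ('c', ''), ('z', '1'), ('z', '2'), ('z', '3')]
)
df = pd.DataFrame(X, columns=cols)

z_block = df['z']        # all columns under the top-level label 'z'
print(z_block.shape)     # (10, 3)
print(df.values.shape)   # (10, 6), the full matrix is still available
```

Plain bracket indexing such as df['z'] or df[('z', '2')] selects the same data as the df.loc[:, ...] calls shown above, so the matrix can be addressed by name while df.values keeps the layout of X.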