Is it possible to create a Pandas DataFrame (or anything similar, for that matter) with multi-dimensional columns? What I want to do is the following:
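from numpy import hstack
from numpy.random import rand
from pandas import DataFrame

a = rand(10,1)
b = rand(10,1)
c = rand(10,1)
z = rand(10,3)
X = hstack((a,b,c,z))  # shape (10, 6)
dfX = DataFrame(X, columns=['a','b','c','z'])  # fails: 6 columns of data, only 4 names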
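The above obviously does not work because of the size of z. My code needs X (or dfX.values) in matrix format, but I would like to be able to address the data as dfX['a'] rather than X[:,0]; the former reads much better. What I am looking for would make dfX['z'].shape equal to (10,3).
Is this even possible? Is there a better solution than a Pandas DataFrame?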
Answer 0 (score: 2)
If I understand you correctly, you are asking about multi-level columns:
In [116]: cols = pd.MultiIndex.from_tuples([('a',''),('b',''),('c',''),('z','1'),('z','2'),('z','3')])
In [117]: df = pd.DataFrame(X, columns=cols)
In [118]: df
Out[118]:
         a        b        c        z
                                    1        2        3
0 0.537156 0.660093 0.327486 0.669400 0.677605 0.174052
1 0.787965 0.983033 0.615065 0.720758 0.853734 0.724249
2 0.587206 0.841086 0.781004 0.676756 0.177496 0.845777
3 0.174780 0.620644 0.338403 0.336302 0.508739 0.210462
4 0.132288 0.765768 0.009254 0.155105 0.548964 0.618722
5 0.980484 0.844023 0.779290 0.462613 0.562098 0.571654
6 0.082263 0.511944 0.003198 0.359354 0.531740 0.870077
7 0.805626 0.745733 0.251047 0.737418 0.532125 0.208116
8 0.906066 0.827050 0.434911 0.869463 0.089989 0.074839
9 0.146566 0.960262 0.957117 0.269052 0.086013 0.558531
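The session above relies on an X that was built earlier. A self-contained sketch of the same construction might look like the following; the seeded generator and the list comprehension that derives the sub-labels from z.shape[1] are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Seed chosen arbitrarily so the sketch is reproducible (assumption).
rng = np.random.default_rng(0)

a = rng.random((10, 1))
b = rng.random((10, 1))
c = rng.random((10, 1))
z = rng.random((10, 3))
X = np.hstack((a, b, c, z))  # plain matrix, shape (10, 6)

# One (top, sub) tuple per column of X; the scalar columns get an empty sub-label.
scalar_cols = [("a", ""), ("b", ""), ("c", "")]
z_cols = [("z", str(i)) for i in range(1, z.shape[1] + 1)]
cols = pd.MultiIndex.from_tuples(scalar_cols + z_cols)

df = pd.DataFrame(X, columns=cols)
```

Deriving the ('z', ...) tuples from the width of z keeps the column index in step with the data if z later gains or loses columns.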
Results:
In [119]: df.loc[:, 'a']
Out[119]:
0 0.537156
1 0.787965
2 0.587206
3 0.174780
4 0.132288
5 0.980484
6 0.082263
7 0.805626
8 0.906066
9 0.146566
Name: a, dtype: float64
In [120]: df.loc[:, 'z']
Out[120]:
         1        2        3
0 0.669400 0.677605 0.174052
1 0.720758 0.853734 0.724249
2 0.676756 0.177496 0.845777
3 0.336302 0.508739 0.210462
4 0.155105 0.548964 0.618722
5 0.462613 0.562098 0.571654
6 0.359354 0.531740 0.870077
7 0.737418 0.532125 0.208116
8 0.869463 0.089989 0.074839
9 0.269052 0.086013 0.558531
In [121]: df.loc[:, ('z','2')]
Out[121]:
0 0.677605
1 0.853734
2 0.177496
3 0.508739
4 0.548964
5 0.562098
6 0.531740
7 0.532125
8 0.089989
9 0.086013
Name: (z, 2), dtype: float64
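To tie this back to the question, here is a minimal sketch checking that the 'z' block has shape (10, 3) and that the frame still converts back to the original matrix. The stand-in data from np.arange is an assumption made so the values are easy to verify; the selections use plain [] indexing, which behaves like the .loc calls above for column labels.

```python
import numpy as np
import pandas as pd

# Stand-in data (assumption): 10 rows, 6 columns, values 0..59.
X = np.arange(60, dtype=float).reshape(10, 6)
cols = pd.MultiIndex.from_tuples(
    [("a", ""), ("b", ""), ("c", ""), ("z", "1"), ("z", "2"), ("z", "3")]
)
df = pd.DataFrame(X, columns=cols)

z_block = df["z"]            # sub-frame with columns '1', '2', '3'
z_shape = z_block.shape      # the shape asked for in the question
z_second = df[("z", "2")]    # a single sub-column, selected by its full tuple
a_col = df["a"]              # scalar column selected by its top-level label

matrix = df.to_numpy()       # back to a plain (10, 6) matrix for numeric code
```

Here df['z'].to_numpy() gives the same values as X[:, 3:], so code that needs the matrix form can keep working on df.to_numpy() while the rest of the code uses labels.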