The pandas syntax for MultiIndex is not that easy to discover. In my case, given this dataset:
import numpy as np
import pandas as pd

header = pd.MultiIndex.from_product([['topic1'],
                                     ['location1', 'location2'],
                                     ['S1', 'S2', 'S3']],
                                    names=['top', 'loc', 'S'])
df = pd.DataFrame(np.random.randn(5, 6),
                  index=['a', 'b', 'c', 'd', 'e'],
                  columns=header)
df
which produces:
top    topic1
loc location1                     location2
S          S1        S2        S3        S1        S2        S3
a   -0.235613  1.064278 -2.147621  0.825380 -0.443313 -1.064031
b    0.404703  0.830838 -0.294387 -1.438028  0.836324 -2.427235
c    0.486648 -0.091448  1.246530 -0.005375  0.159478 -0.103404
d   -0.638070 -1.057061  0.596882 -1.007059 -0.654583 -0.618137
e   -0.850887 -1.660056  0.129954  1.204890 -1.457207  0.678393
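The column structure can also be checked programmatically instead of being read off the printed frame. The following is a minimal sketch that assumes the same header as above; the seeded generator is an addition for reproducibility and is not part of the original question.

```python
import numpy as np
import pandas as pd

header = pd.MultiIndex.from_product(
    [['topic1'], ['location1', 'location2'], ['S1', 'S2', 'S3']],
    names=['top', 'loc', 'S'])

# Seeded generator: an assumption added here so the frame is reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 6)),
                  index=list('abcde'), columns=header)

print(df.columns.nlevels)       # number of column levels
print(list(df.columns.names))   # names of the column levels
print(df.columns.get_level_values('loc').unique().tolist())

# Selecting a top-level key returns a frame whose columns keep
# only the remaining two levels ('loc' and 'S').
sub = df['topic1']
print(list(sub.columns.names))
```

Selecting by the top-level key drops that level, so the result has two column levels rather than three.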
I would like to:
Answer 0 (score: 1)
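You can use concat with the keys parameter to define the top level of the MultiIndex. To change the names of the column levels, use rename_axis or assign the values directly: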
df = (pd.concat([df['topic1'], df['topic1'] * 2], keys=('topic1', 'topic2'), axis=1)
        .rename_axis(('aaa', 'bbb', df.columns.names[2]), axis=1))
Alternative:
df = pd.concat([df['topic1'], df['topic1'] * 2], keys=('topic1','topic2'), axis=1)
df.columns.names = ('aaa','bbb', df.columns.names[2])
print(df)
aaa    topic1                                                      topic2  \
bbb location1                     location2                     location1
S          S1        S2        S3        S1        S2        S3        S1
a    0.511604 -0.217660 -0.521060  1.253270  1.104554 -0.770309  1.023207
b    0.632975 -1.322322 -0.936332  0.436361  1.233744  0.527565  1.265951
c   -0.369576  1.820059 -1.373630 -0.414554 -0.098443  0.904791 -0.739151
d    1.656726 -0.972017 -0.300689 -0.179819  0.472515  2.379975  3.313453
e   -0.053210 -0.180697  0.176240 -1.087404 -1.012181 -0.049870 -0.106421

aaa
bbb                     location2
S          S2        S3        S1        S2        S3
a   -0.435320 -1.042120  2.506541  2.209108 -1.540617
b   -2.644644 -1.872664  0.872723  2.467488  1.055129
c    3.640118 -2.747261 -0.829108 -0.196885  1.809582
d   -1.944034 -0.601379 -0.359638  0.945030  4.759950
e   -0.361395  0.352480 -2.174809 -2.024362 -0.099739
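Because the printed frame wraps across two blocks, it can be easier to confirm the result by checking the level names and comparing the new block against the original. This is a minimal sketch under the same assumptions as the answer: the names aaa and bbb come from the answer, while the seed and the variable names base and out are additions for illustration.

```python
import numpy as np
import pandas as pd

header = pd.MultiIndex.from_product(
    [['topic1'], ['location1', 'location2'], ['S1', 'S2', 'S3']],
    names=['top', 'loc', 'S'])

# Seeded generator: an assumption added here so the frame is reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 6)),
                  index=list('abcde'), columns=header)

# df['topic1'] drops the top level, leaving two-level columns ('loc', 'S').
base = df['topic1']

# keys= adds a new outermost column level, one key per concatenated frame.
out = pd.concat([base, base * 2], keys=['topic1', 'topic2'], axis=1)
out.columns.names = ['aaa', 'bbb', df.columns.names[2]]

print(list(out.columns.names))
print(out.columns.get_level_values('aaa').unique().tolist())
print(np.allclose(out['topic2'], 2 * out['topic1']))
```

The last line compares the topic2 block with twice the topic1 block, which is how the answer constructs it.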