How can I index the rows of a MultiIndex DataFrame
import pandas as pd
import numpy as np
np.random.seed(0)
tuples = list(zip(*[['bar', 'bar', 'baz', 'baz'],['one', 'two', 'one', 'two']]))
idx = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(4, 2), index=idx, columns=['A', 'B'])
print(df)
                     A         B
first second
bar   one     1.764052  0.400157
      two     0.978738  2.240893
baz   one     1.867558 -0.977278
      two     0.950088 -0.151357
using the columns of a second DataFrame
idxDf = pd.DataFrame({'first':['bar','baz'],'second':['one','two']})
print(idxDf)
  first second
0   bar    one
1   baz    two
so that the resulting DataFrame is
first second
bar   one     1.764052  0.400157
baz   two     0.950088 -0.151357
?
Clearly, df[idxDf['first','second']] does not work.
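To illustrate why that expression fails, here is a minimal sketch that rebuilds idxDf as defined above. The explanation is my reading of pandas behaviour, not something stated in the question: idxDf['first','second'] is treated as a lookup of one column whose label is the tuple ('first', 'second'), and no such column exists, so pandas raises a KeyError before df is ever indexed.

```python
import pandas as pd

idxDf = pd.DataFrame({'first': ['bar', 'baz'], 'second': ['one', 'two']})

# Indexing with two comma-separated labels passes the tuple
# ('first', 'second') as a single column key.
raised = False
try:
    idxDf['first', 'second']
except KeyError as err:
    raised = True
    print('KeyError:', err)

# Selecting both columns requires a list of labels instead.
both = idxDf[['first', 'second']]
print(both)
```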
Answer 0 (score: 2)
Combine DataFrame.merge with DataFrame.reset_index and DataFrame.set_index:
print (df.reset_index().merge(idxDf, on=['first','second']).set_index(['first','second']))
                     A         B
first second
bar   one     1.764052  0.400157
baz   two     0.950088 -0.151357
print (df.merge(idxDf,
                left_index=True,
                right_on=['first','second']).set_index(['first','second']))
                     A         B
first second
bar   one     1.764052  0.400157
baz   two     0.950088 -0.151357
Or call DataFrame.set_index on idxDf before the merge:
print (df.merge(idxDf.set_index(['first','second']),
                left_index=True,
                right_index=True))
                     A         B
first second
bar   one     1.764052  0.400157
baz   two     0.950088 -0.151357
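As a possible alternative to merging (a sketch of a different technique, not part of the answer above), the key columns of idxDf can be turned into a MultiIndex and used to build a boolean mask with MultiIndex.isin. This assumes pandas 0.24 or later for MultiIndex.from_frame. The names keys, mask and result are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the data from the question.
np.random.seed(0)
idx = pd.MultiIndex.from_tuples(
    [('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two')],
    names=['first', 'second'],
)
df = pd.DataFrame(np.random.randn(4, 2), index=idx, columns=['A', 'B'])
idxDf = pd.DataFrame({'first': ['bar', 'baz'], 'second': ['one', 'two']})

# Build a MultiIndex from the key columns of idxDf.
keys = pd.MultiIndex.from_frame(idxDf[['first', 'second']])

# True for each row of df whose (first, second) pair appears in keys.
mask = df.index.isin(keys)
result = df[mask]
print(result)
```

Like the default inner merge, this silently skips pairs in idxDf that are absent from df. Unlike the merge, it keeps the rows in the order they have in df and does not duplicate rows when idxDf contains repeated pairs.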