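I often use the values in a specific level of a DataFrame's index as a guide for what to do next. In this case, I am slicing a DataFrame with pd.IndexSlice and then referring to the index of the resulting DataFrame. The problem is that the levels of the resulting index are the same as in the original. I need the index to be a subset of the original that respects the slice I made.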
import pandas as pd

def produce_df(rows, columns, row_names=None, column_names=None):
    """rows is a list of lists that will be used to build a MultiIndex
    columns is a list of lists that will be used to build a MultiIndex"""
    row_index = pd.MultiIndex.from_product(rows, names=row_names)
    col_index = pd.MultiIndex.from_product(columns, names=column_names)
    return pd.DataFrame(index=row_index, columns=col_index)

df = produce_df([['a', 'b'], ['c', 'd']], [['1', '2'], ['3', '4']],
                row_names=['alpha1', 'alpha2'], column_names=['number1', 'number2'])
print(df)
number1          1         2
number2          3    4    3    4
alpha1 alpha2
a      c       NaN  NaN  NaN  NaN
       d       NaN  NaN  NaN  NaN
b      c       NaN  NaN  NaN  NaN
       d       NaN  NaN  NaN  NaN
The index looks like this:
print(df.index)
MultiIndex(levels=[[u'a', u'b'], [u'c', u'd']],
           labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
           names=[u'alpha1', u'alpha2'])
Then I slice:
islc = pd.IndexSlice[['a'], :]
df2 = df.loc[islc, :]
print(df2)
number1          1         2
number2          3    4    3    4
alpha1 alpha2
a      c       NaN  NaN  NaN  NaN
       d       NaN  NaN  NaN  NaN
This is the slice I expected. Here is what df2.index looks like:
MultiIndex(levels=[[u'a', u'b'], [u'c', u'd']],
           labels=[[0, 0], [0, 1]],
           names=[u'alpha1', u'alpha2'])
df2.index.levels[0] still contains 'b', even though no row in the sliced MultiIndex uses it.
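To make the mismatch concrete, here is a minimal sketch (assuming Python 3 and a reasonably recent pandas) that rebuilds the frame above and compares the levels stored on the sliced index with the labels its rows actually use. The variable names stored_levels and used_values are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question.
rows = pd.MultiIndex.from_product([['a', 'b'], ['c', 'd']],
                                  names=['alpha1', 'alpha2'])
cols = pd.MultiIndex.from_product([['1', '2'], ['3', '4']],
                                  names=['number1', 'number2'])
df = pd.DataFrame(index=rows, columns=cols)

df2 = df.loc[pd.IndexSlice[['a'], :], :]

# .levels is the stored vocabulary of the index, not the labels in use.
stored_levels = list(df2.index.levels[0])

# get_level_values() reads the labels that the remaining rows actually carry.
used_values = list(df2.index.get_level_values('alpha1').unique())

print(stored_levels)
print(used_values)
```

If only the labels in use matter, get_level_values() may be enough on its own, without rebuilding the index.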
Answer 0 (score: 0)
This works, but it is clunky. I feel like there should be a built-in way that I am overlooking.
df2.index = pd.MultiIndex.from_tuples(df2.index.to_series().values, names=df.index.names)
print(df2)
number1          1         2
number2          3    4    3    4
alpha1 alpha2
a      c       NaN  NaN  NaN  NaN
       d       NaN  NaN  NaN  NaN
print(df2.index)
MultiIndex(levels=[[u'a'], [u'c', u'd']],
           labels=[[0, 0], [0, 1]],
           names=[u'alpha1', u'alpha2'])
'b' is gone from df2.index.levels[0].
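Newer pandas releases (0.20 and later, as far as I know) also provide MultiIndex.remove_unused_levels(), which may be a less clunky route than rebuilding the index from tuples. A minimal sketch, assuming Python 3 and a pandas version that has this method, using the same frame as in the question:

```python
import pandas as pd

# Rebuild the example frame from the question.
rows = pd.MultiIndex.from_product([['a', 'b'], ['c', 'd']],
                                  names=['alpha1', 'alpha2'])
cols = pd.MultiIndex.from_product([['1', '2'], ['3', '4']],
                                  names=['number1', 'number2'])
df = pd.DataFrame(index=rows, columns=cols)

# Slice as before; copy so the new index is assigned to an independent frame.
df2 = df.loc[pd.IndexSlice[['a'], :], :].copy()

# remove_unused_levels() returns a new MultiIndex whose levels hold only
# the labels still referenced by the rows; names are carried over.
df2.index = df2.index.remove_unused_levels()

print(list(df2.index.levels[0]))
print(list(df2.index.levels[1]))
```

The row labels and their order are unchanged. Only the stored levels shrink, so code that reads df2.index.levels afterwards sees the sliced subset.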