我想从多级数据框中删除行索引的列表,其中前两级具有重复的条目。我想可以无循环地执行此操作,但是到目前为止,我还没有发现这一点。
我试图通过提供列表行索引组合来使用pd.drop函数,尽管这样做没有理想的效果。例如:
import numpy as np
import pandas as pd
def mklbl(prefix, n):
return ["%s%s" % (prefix, i) for i in range(n)]
def src_rec(n, mult):
src = [[no]*mult for no in range(1,n)]
src = [item for sublist in src for item in sublist]
rec = [no for no in range(1,n)]*mult
return src, rec
src, rec = src_rec(4,4)
miindex = pd.MultiIndex.from_arrays([src*2,
rec*2,
mklbl('C', 24)])
dfmi = pd.DataFrame(np.arange(len(miindex) * 2)\
.reshape((len(miindex), 2)),
index=miindex)
As = [1, 2]
Bs = [2, 3]
dfmi.drop(pd.MultiIndex.from_arrays([As,Bs]))
其结果是:
0 1
1 1 C0 0 1
2 1 C18 36 37
2 C19 38 39
3 3 C20 40 41
1 C21 42 43
2 C22 44 45
3 C23 46 47
我想要的结果是:
0 1
1 1 C0 0 1
3 C2 4 5
1 C3 6 7
2 2 C4 8 9
1 C6 12 13
2 C7 14 15
3 3 C8 16 17
1 C9 18 19
2 C10 20 21
3 C11 22 23
1 1 C12 24 25
3 C14 28 29
1 C15 30 31
2 2 C16 32 33
1 C18 36 37
2 C19 38 39
3 3 C20 40 41
1 C21 42 43
2 C22 44 45
3 C23 46 47
一个循环的例子是
for A, B in zip(As, Bs):
dfmi_drop_idx = CCdata.loc[(A, B, slice(None)), :].index
dfmi.drop(dfmi_drop_idx, inplace=True, errors='raise')
答案 0 :(得分:1)
将boolean indexing
用于Index.isin
的测试成员资格:
m = pd.MultiIndex.from_arrays([As,Bs])
df = dfmi[~dfmi.reset_index(level=2, drop=True).index.isin(m)]
print (df)
0 1
1 1 C0 0 1
3 C2 4 5
1 C3 6 7
2 2 C4 8 9
1 C6 12 13
2 C7 14 15
3 3 C8 16 17
1 C9 18 19
2 C10 20 21
3 C11 22 23
1 1 C12 24 25
3 C14 28 29
1 C15 30 31
2 2 C16 32 33
1 C18 36 37
2 C19 38 39
3 3 C20 40 41
1 C21 42 43
2 C22 44 45
3 C23 46 47