I have a DataFrame with a 3-level MultiIndex:
>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(0)
>>> df = pd.DataFrame(np.random.randint(10, size=(18, 2)),
...                   index=pd.MultiIndex.from_product([[True, False],
...                                                     ['yes', 'no', 'maybe'],
...                                                     ['one', 'two', 'three']],
...                                                    names=['bool', 'ans', 'count']),
...                   columns=['A', 'B'])
>>> df
                   A  B
bool  ans   count
True  yes   one    5  0
            two    3  3
            three  7  9
      no    one    3  5
            two    2  4
            three  7  6
      maybe one    8  8
            two    1  6
            three  7  7
False yes   one    8  1
            two    5  9
            three  8  9
      no    one    4  3
            two    0  3
            three  5  0
      maybe one    2  3
            two    8  1
            three  3  3
My goal is to subtract the `maybe` values from all the other values that share the same `bool` and `count`. The subtrahend is
>>> sub = df.loc[(slice(None), 'maybe', slice(None)), :]
>>> sub
                   A  B
bool  ans   count
True  maybe one    8  8
            two    1  6
            three  7  7
False maybe one    2  3
            two    8  1
            three  3  3
The problem is that when I try to subtract it from the other rows, the indexes do not align the way I expected:
>>> df - sub
                     A    B
bool  ans   count
False maybe one    0.0  0.0
            three  0.0  0.0
            two    0.0  0.0
      no    one    NaN  NaN
            three  NaN  NaN
            two    NaN  NaN
      yes   one    NaN  NaN
            three  NaN  NaN
            two    NaN  NaN
True  maybe one    0.0  0.0
            three  0.0  0.0
            two    0.0  0.0
      no    one    NaN  NaN
            three  NaN  NaN
            two    NaN  NaN
      yes   one    NaN  NaN
            three  NaN  NaN
            two    NaN  NaN
The result I want is
                    A   B
bool  ans   count
True  yes   one    -3  -8
            two     2  -3
            three   0   2
True  no    one    -5  -3
            two     1  -2
            three   0  -1
True  maybe one     0   0
            two     0   0
            three   0   0
False yes   one     6  -2
            two    -3   8
            three   5   6
False no    one     2   0
            two    -8   2
            three   2  -3
False maybe one     0   0
            two     0   0
            three   0   0
How do I tell pandas to respect the `bool` and `count` levels but ignore the `ans` level?
Answer 0 (score: 2)
A cross-section with `xs` may be a good option here:
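# xs drops the 'ans' level, leaving a (bool, count) index,
# so the subtraction aligns on those two shared levels only.
df.sub(
    df.xs('maybe', level=1)
).swaplevel().reindex(df.index)  # restore the original level order and row order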
Output:
                    A   B
bool  ans   count
True  yes   one    -3  -8
            two     2  -3
            three   0   2
      no    one    -5  -3
            two     1  -2
            three   0  -1
      maybe one     0   0
            two     0   0
            three   0   0
False yes   one     6  -2
            two    -3   8
            three   5   6
      no    one     2   0
            two    -8   2
            three   2  -3
      maybe one     0   0
            two     0   0
            three   0   0
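If you would rather not depend on the level order that the aligned subtraction produces, one possible variation is to look up the matching `maybe` row for every row yourself and then subtract by position. This is only a sketch built on the example frame from the question; the names `baseline`, `keys`, `aligned` and `result` are mine and purely illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
np.random.seed(0)
df = pd.DataFrame(
    np.random.randint(10, size=(18, 2)),
    index=pd.MultiIndex.from_product(
        [[True, False], ['yes', 'no', 'maybe'], ['one', 'two', 'three']],
        names=['bool', 'ans', 'count'],
    ),
    columns=['A', 'B'],
)

# Baseline rows: the 'maybe' cross-section, indexed by (bool, count) only.
baseline = df.xs('maybe', level='ans')

# One (bool, count) lookup key per row of df. The keys repeat, which
# reindex accepts here because baseline's own index is unique.
keys = df.index.droplevel('ans')
aligned = baseline.reindex(keys)

# aligned is in the same row order as df, so subtract positionally
# and keep df's original 3-level index untouched.
result = df - aligned.to_numpy()
print(result)
```

Because the subtraction is positional, `result` keeps the index of `df` as it is, so no `swaplevel` or final `reindex` is needed. The trade-off is that it assumes every `(bool, count)` pair has exactly one `maybe` row, which holds for this example.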