Question

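I have a dataframe with a 3-level deep multi-index on the columns. I would like to compute subtotals across rows (sum(axis=1)) where I sum over one of the levels while preserving the others. I think I know how to do this using the level keyword argument of pd.DataFrame.sum. However, I'm having trouble working out how to incorporate the result of this sum back into the original table.

Setup:

import numpy as np
import pandas as pd
from itertools import product

np.random.seed(0)

colors = ['red', 'green']
shapes = ['square', 'circle']
obsnum = range(5)

rows = list(product(colors, shapes, obsnum))
idx = pd.MultiIndex.from_tuples(rows)
idx.names = ['color', 'shape', 'obsnum']

df = pd.DataFrame({'attr1': np.random.randn(len(rows)),
                   'attr2': 100 * np.random.randn(len(rows))},
                  index=idx)

df.columns.names = ['attribute']

df = df.unstack(['color', 'shape'])

This gives a nice frame:

[Image: original frame]

Say I want to reduce the shape level. I can run:

tots = df.sum(axis=1, level=['attribute', 'color'])

to get my totals:

[Image: totals]

With this in hand, I'd like to tack it onto the original frame. I think I can do this in a somewhat cumbersome way:

[...]

[Image: aggregated]

Is there a more natural way to do this?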
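For reference, the "cumbersome" route described in the question could look roughly like the sketch below. This is an assumption about its general shape, not the asker's original code: the label 'sum(shape)' is borrowed from Answer 1, the name bigdf is made up for illustration, and the subtotal is computed with a groupby on the transposed frame because the level argument of sum() was removed in pandas 2.0.

```python
import numpy as np
import pandas as pd
from itertools import product

# Rebuild the example frame from the question.
np.random.seed(0)
rows = list(product(['red', 'green'], ['square', 'circle'], range(5)))
idx = pd.MultiIndex.from_tuples(rows, names=['color', 'shape', 'obsnum'])
df = pd.DataFrame({'attr1': np.random.randn(len(rows)),
                   'attr2': 100 * np.random.randn(len(rows))}, index=idx)
df.columns.names = ['attribute']
df = df.unstack(['color', 'shape'])

# Subtotals over 'shape', keeping 'attribute' and 'color'.
# Grouping the transposed frame avoids the removed level= argument of sum().
tots = df.T.groupby(level=['attribute', 'color']).sum().T

# Give the subtotal columns a third level so they line up with df.columns.
# 'sum(shape)' is an illustrative label, not something pandas requires.
tots.columns = pd.MultiIndex.from_tuples(
    [(attr, clr, 'sum(shape)') for attr, clr in tots.columns],
    names=df.columns.names)

# Glue the subtotals on and sort so each one sits next to its group.
bigdf = pd.concat([df, tots], axis=1).sort_index(axis=1)
```

Sorting the columns places each 'sum(shape)' column after the 'circle' and 'square' columns of the same attribute and color, since the labels sort alphabetically.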

Answer 1

Here is a way without loops:

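# Sum over 'shape', keeping levels 0 and 1 (level= needs pandas < 2.0).
s = df.sum(axis=1, level=[0, 1]).T
# Label for the summed-out level, moved into the index as a third level.
s["shape"] = "sum(shape)"
s.set_index("shape", append=True, inplace=True)
# Transpose back and merge the subtotal columns into df.
df.combine_first(s.T)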

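The trick is to work with the transpose of the sum. That lets us insert another column (which becomes a row after transposing back) holding the label for the additional level, and we name that column exactly like the level we summed over. The column can then be turned into a level of the index with set_index. Finally we combine df with the transposed sum. If the summed level is not the last one, some reordering of levels may be needed.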
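On pandas 2.0 and later, sum() no longer accepts level, so the first line of the snippet above raises an error there. A minimal sketch of the same transpose trick, assuming a groupby on the transposed frame as the replacement (my substitution, not part of the original answer), and using the example frame from the question:

```python
import numpy as np
import pandas as pd
from itertools import product

# Example frame from the question.
np.random.seed(0)
rows = list(product(['red', 'green'], ['square', 'circle'], range(5)))
idx = pd.MultiIndex.from_tuples(rows, names=['color', 'shape', 'obsnum'])
df = pd.DataFrame({'attr1': np.random.randn(len(rows)),
                   'attr2': 100 * np.random.randn(len(rows))}, index=idx)
df.columns.names = ['attribute']
df = df.unstack(['color', 'shape'])

# Transposed subtotals: rows are (attribute, color), columns are obsnum.
s = df.T.groupby(level=[0, 1]).sum()

# Temporary column holding the label for the summed-out level,
# then moved into the row index as a third level named 'shape'.
s["shape"] = "sum(shape)"
s = s.set_index("shape", append=True)

# Transpose back and merge the subtotal columns into the original frame.
result = df.combine_first(s.T)
```

Because the new index level is named 'shape', the columns of s.T carry the same three level names as df.columns, which is what lets combine_first line the two frames up.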

Answer 2

Here is my brute-force way of doing it.

After running your well-made (thank you) sample code, I did this:

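# Unique labels of the two levels that are kept.
attributes = pd.unique(df.columns.get_level_values('attribute'))
colors = pd.unique(df.columns.get_level_values('color'))

for attr in attributes:
    for clr in colors:
        # Select the (attr, clr) block and sum its 'shape' columns row-wise.
        # The key is a tuple: pandas 2.0 no longer accepts a list key in xs.
        df[(attr, clr, 'sum')] = df.xs((attr, clr), level=['attribute', 'color'], axis=1).sum(axis=1)

df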

Which gives me:

[Image: big table]

Adding subtotal columns in pandas with a multi-index
