Question

I want to add columns to a pandas MultiIndex DataFrame that hold the results of operations on other columns.

I have a DataFrame similar to this one:

first   bar     baz     
second  one two one two
A       5   2   9   2   
B       6   4   7   6   
C       5   4   5   1

Now, for each group in the DataFrame, I want to add a column "three" equal to column "one" minus column "two":

first   bar             baz     
second  one two three   one two three
A       5   2   3       9   2   7
B       6   4   2       7   6   1
C       5   4   1       5   1   4

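In reality, my DataFrame is much larger. I am struggling to find an answer to this (hopefully) simple question. Any suggestions are appreciated.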

Answer 1

Use a MultiIndex.

Create the additional DataFrame:

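# The values are placeholders, not the computed differences.
# index=df.index keeps the rows aligned with df (A, B, C) for the concat below.
s=pd.DataFrame([[1,2],[2,3],[3,4]],index=df.index,columns=pd.MultiIndex.from_arrays([['bar','baz'],['three','three']]))
s
Out[458]: 
    bar   baz
  three three
A     1     2
B     2     3
C     3     4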

Then concatenate:

yourdf=pd.concat([df,s],axis=1).sort_index(level=0,axis=1)

If the column order matters, you can use reindex or consider factorizing the level.
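A minimal sketch of the reindex option, assuming the sample frame from the question. Sorting the columns should place "three" before "two" alphabetically, so an explicit helper MultiIndex is used to restore the one/two/three order. The placeholder values in `s` and the name `mux` are illustrative only.

```python
import pandas as pd

# Sample frame from the question, rebuilt so the sketch is self-contained.
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(
    [[5, 2, 9, 2], [6, 4, 7, 6], [5, 4, 5, 1]],
    index=["A", "B", "C"],
    columns=columns,
)

# Placeholder values, as in the answer; index=df.index keeps the rows aligned.
s = pd.DataFrame(
    [[1, 2], [2, 3], [3, 4]],
    index=df.index,
    columns=pd.MultiIndex.from_arrays(
        [["bar", "baz"], ["three", "three"]], names=df.columns.names
    ),
)

# Concatenate column-wise, then sort the column labels.
yourdf = pd.concat([df, s], axis=1).sort_index(level=0, axis=1)

# Restore the desired order within each group via an explicit column index.
mux = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two", "three"]], names=df.columns.names
)
yourdf = yourdf.reindex(columns=mux)
print(yourdf)
```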

Answer 2

Use DataFrame.xs to select the one and two levels and subtract them, then build a MultiIndex for the columns with MultiIndex.from_product:

df1 = df.xs('one', axis=1, level=1) - df.xs('two', axis=1, level=1)
df1.columns = pd.MultiIndex.from_product([df1.columns, ['three']])
print (df1)
    bar   baz
  three three
A     3     7
B     2     1
C     1     4

Then concat the result to the original DataFrame and use reindex with a helper MultiIndex to fix the column order:

mux = pd.MultiIndex.from_product([['bar','baz'], ['one','two','three']], 
                                  names=df.columns.names)
df = pd.concat([df, df1], axis=1).reindex(columns=mux)
print (df)
first  bar           baz          
second one two three one two three
A        5   2     3   9   2     7
B        6   4     2   7   6     1
C        5   4     1   5   1     4

Operations on a multi-index DataFrame
