Question

I have a DataFrame like this:

status               new                    allocation          
asset                csh       fi        eq        csh   fi   eq
person act_type                                                 
p1     inv           0.0      0.0  100000.0        0.0  0.0  1.0
       rsp           0.0  30000.0   20000.0        0.0  0.6  0.4
       tfsa      10000.0  40000.0       0.0        0.2  0.8  0.0

The three columns on the right are the percentages of the total for each act_type. The following computes those columns correctly:

# set the percent allocations
df.loc[idx[:,:],idx["allocation",'csh']] = df.loc[idx[:,:],idx["new",'csh']] / df.loc[idx[:,:],idx["new",:]].sum(axis=1)
df.loc[idx[:,:],idx["allocation",'fi']] = df.loc[idx[:,:],idx["new",'fi']] / df.loc[idx[:,:],idx["new",:]].sum(axis=1)
df.loc[idx[:,:],idx["allocation",'eq']] = df.loc[idx[:,:],idx["new",'eq']] / df.loc[idx[:,:],idx["new",:]].sum(axis=1)

I tried to do these calculations for 'csh', 'fi' and 'eq' together in a single line, as follows:

df.loc[idx[:,:],idx["new", ('csh', 'fi', 'eq')]] / df.loc[idx[:,:],idx["new",:]].sum(axis=1)

But this raises ValueError: cannot join with no level specified and no overlapping names

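Can anyone suggest how to reduce these three lines to one line of code, so that I divide ('csh', 'fi', 'eq') by the account total and get the percentages in the adjacent columns?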

Answer 1

First, simplify idx[:,:] to :. Then use DataFrame.div with axis=0, and for the new columns combine rename with DataFrame.join:

df1=df.loc[:, idx["new",('csh', 'fi', 'eq')]].div(df.loc[:, idx["new",:]].sum(axis=1),axis=0)
df = df.join(df1.rename(columns={'new':'allocation'}, level=0))
print (df)
status               new                    allocation          
asset                csh       fi        eq        csh   fi   eq
person act_type                                                 
p1     inv           0.0      0.0  100000.0        0.0  0.0  1.0
       rsp           0.0  30000.0   20000.0        0.0  0.6  0.4
       tfsa      10000.0  40000.0       0.0        0.2  0.8  0.0
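
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes idx is pd.IndexSlice and that the frame starts with only the "new" columns (join raises an error if the "allocation" columns already exist). The names totals, shares and result are only illustrative. The one-line attempt in the question fails because the / operator aligns the Series index (person, act_type) with the column labels (status, asset); div with axis=0 aligns it with the rows instead.

```python
import pandas as pd

idx = pd.IndexSlice

# Rebuild only the "new" half of the frame from the question.
index = pd.MultiIndex.from_tuples(
    [("p1", "inv"), ("p1", "rsp"), ("p1", "tfsa")],
    names=["person", "act_type"],
)
columns = pd.MultiIndex.from_product(
    [["new"], ["csh", "fi", "eq"]], names=["status", "asset"]
)
df = pd.DataFrame(
    [[0.0, 0.0, 100000.0], [0.0, 30000.0, 20000.0], [10000.0, 40000.0, 0.0]],
    index=index,
    columns=columns,
)

# Row totals across all "new" columns (one value per person/act_type row).
totals = df.loc[:, idx["new", :]].sum(axis=1)

# axis=0 matches the totals to the row index rather than the column labels.
shares = df.loc[:, idx["new", ["csh", "fi", "eq"]]].div(totals, axis=0)

# Relabel the top column level and attach the result as new columns.
result = df.join(shares.rename(columns={"new": "allocation"}, level=0))
print(result)
```

If the "allocation" columns are already present, as in the question's frame, assigning the computed values back into those columns is probably a better fit than join.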

MultiIndex calculation for new columns
