Given the following DataFrame and pivot table:
import pandas as pd
df = pd.DataFrame({'A': ['x', 'y', 'z', 'x', 'y', 'z'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'C': [2, 18, 2, 8, 2, 18]})
df
   A    B   C
0  x  one   2
1  y  one  18
2  z  one   2
3  x  two   8
4  y  two   2
5  z  two  18
table = pd.pivot_table(df, index=['A', 'B'], aggfunc='sum')
        C
A B
x one   2
  two   8
y one  18
  two   2
z one   2
  two  18
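I'd like to add two columns to this pivot table: one showing each value as a percentage of the grand total, and another showing it as a percentage within its group in column A, like this:
        C % of Total % of B
A B
x one   2         4%    10%
  two  18        36%    90%
y one   2         4%    20%
  two   8        16%    80%
z one   2         4%    10%
  two  18        36%    90%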
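Extra credit:
I'd like a bottom summary row containing the sum of column C (it's fine if the next two columns also show 100%, but nothing is needed for them).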
Answer 0 (score: 18)
You can use:
# Share of the grand total of C
table['% of Total'] = (table.C / table.C.sum() * 100).astype(str) + '%'
# Share within each group of the first index level (A)
table['% of B'] = (table.C / table.groupby(level=0).C.transform('sum') * 100).astype(str) + '%'
print(table)
        C % of Total % of B
A B
x one   2       4.0%  20.0%
  two   8      16.0%  80.0%
y one  18      36.0%  90.0%
  two   2       4.0%  10.0%
z one   2       4.0%  10.0%
  two  18      36.0%  90.0%
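To see why the second line gives per-group percentages, it may help to look at the intermediate result of transform. The sketch below rebuilds the same table and prints the group totals; the names group_totals and pct_of_group are only illustrative and do not appear in the answer.

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'y', 'z', 'x', 'y', 'z'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'C': [2, 18, 2, 8, 2, 18]})
table = pd.pivot_table(df, index=['A', 'B'], aggfunc='sum')

# Sum of C within each value of the first index level (A), repeated
# on every row of that group so it lines up with table's index.
group_totals = table.groupby(level=0)['C'].transform('sum')
print(group_totals.tolist())  # [10, 10, 20, 20, 20, 20]

# Dividing row by row gives each row's share of its own group.
pct_of_group = table['C'] / group_totals * 100
# Within each A group, the shares add up to (approximately) 100.
print(pct_of_group.groupby(level=0).sum())
```

Because transform returns a result with the same index as the input, the division needs no merge or reindexing step.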
With real data, though, I think casting to int is not recommended; it is better to use round.
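A minimal sketch of the rounding variant, assuming one decimal place is wanted (the precision and the names pct_total and pct_group are my assumptions, not part of the answer):

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'y', 'z', 'x', 'y', 'z'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'C': [2, 18, 2, 8, 2, 18]})
table = pd.pivot_table(df, index=['A', 'B'], aggfunc='sum')

pct_total = table['C'] / table['C'].sum() * 100
pct_group = table['C'] / table.groupby(level=0)['C'].transform('sum') * 100

# Round first, then convert to text for display. Casting to int would
# truncate instead (e.g. 33.9 would become 33).
table['% of Total'] = pct_total.round(1).astype(str) + '%'
table['% of B'] = pct_group.round(1).astype(str) + '%'
print(table)
```

Keeping the numeric series (pct_total, pct_group) separate from the formatted columns leaves them available for further calculation.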
Extra credit:
table['% of Total'] = table.C / table.C.sum() * 100
table['% of B'] = table.C / table.groupby(level=0).C.transform('sum') * 100
# Append a summary row holding the column sums.
# Note: the '% of B' total is 300 because each of the three A groups sums to 100.
table.loc['total', :] = table.sum().values
print(table)
              C  % of Total  % of B
A     B
x     one   2.0         4.0    20.0
      two   8.0        16.0    80.0
y     one  18.0        36.0    90.0
      two   2.0         4.0    10.0
z     one   2.0         4.0    10.0
      two  18.0        36.0    90.0
total      50.0       100.0   300.0
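The 300.0 in the last column is the sum of three groups that each add up to 100, so it carries no useful meaning. One possible variation, sketched here under the assumption that an empty cell is acceptable there, builds the summary row separately and appends it with pd.concat; total_row and result are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'y', 'z', 'x', 'y', 'z'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'C': [2, 18, 2, 8, 2, 18]})
table = pd.pivot_table(df, index=['A', 'B'], aggfunc='sum')
table['% of Total'] = table['C'] / table['C'].sum() * 100
table['% of B'] = table['C'] / table.groupby(level=0)['C'].transform('sum') * 100

# One-row frame with the same columns and index levels as table.
# '% of B' is left as NaN because a sum across groups is not meaningful.
total_row = pd.DataFrame(
    {'C': [table['C'].sum()],
     '% of Total': [table['% of Total'].sum()],
     '% of B': [float('nan')]},
    index=pd.MultiIndex.from_tuples([('total', '')], names=table.index.names),
)
result = pd.concat([table, total_row])
print(result)
```

Building the row explicitly also keeps column C as integers, whereas assigning a row of floats through loc converts it to float.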