I have a probability table like this:
import numpy as np
import pandas as pd

BC_array = [np.array(['B=n', 'B=m', 'B=s', 'B=n', 'B=m', 'B=s']), np.array(['C=F', 'C=F', 'C=F', 'C=T', 'C=T', 'C=T'])]
pD_BC_array = np.array([[0.9, 0.8, 0.1, 0.3, 0.4, 0.01], [0.08, 0.17, 0.01, 0.05, 0.05, 0.01], [0.01, 0.01, 0.87, 0.05, 0.15, 0.97], [0.01, 0.02, 0.02, 0.6, 0.4, 0.01]])
pD_BC = pd.DataFrame(pD_BC_array, index=['D=h', 'D=c', 'D=s', 'D=r'], columns=BC_array)
      B=n   B=m   B=s   B=n   B=m   B=s
      C=F   C=F   C=F   C=T   C=T   C=T
D=h  0.90  0.80  0.10  0.30  0.40  0.01
D=c  0.08  0.17  0.01  0.05  0.05  0.01
D=s  0.01  0.01  0.87  0.05  0.15  0.97
D=r  0.01  0.02  0.02  0.60  0.40  0.01
How can I marginalize out C (that is, sum the 'C=F' and 'C=T' columns together for each value of B) and get this table:
      B=n   B=m   B=s
D=h  1.20  1.20  0.11
D=c  0.13  0.22  0.02
D=s  0.06  0.16  1.84
D=r  0.61  0.42  0.03
Is there a way to get a result like this?
Answer 0 (score: 1)
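You can call sum on the df, passing axis=1 to sum across each row and level=0 to group the columns by the first level of the column index (the B labels), so that the C=F and C=T values are added together for each B: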
In [259]:
pD_BC.sum(axis=1, level=0)
Out[259]:
      B=m   B=n   B=s
D=h  1.20  1.20  0.11
D=c  0.22  0.13  0.02
D=s  0.16  0.06  1.84
D=r  0.42  0.61  0.03
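
Note that the level argument of sum was deprecated in pandas 1.3 and removed in pandas 2.0, so the call above may fail on a recent install. A minimal sketch of one possible equivalent, assuming a newer pandas, is to transpose, group on the first index level, sum, and transpose back. The name pD_B and the explicit column reordering at the end are my own additions for illustration, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Rebuild the table from the question.
BC_array = [
    np.array(['B=n', 'B=m', 'B=s', 'B=n', 'B=m', 'B=s']),
    np.array(['C=F', 'C=F', 'C=F', 'C=T', 'C=T', 'C=T']),
]
pD_BC_array = np.array([
    [0.90, 0.80, 0.10, 0.30, 0.40, 0.01],
    [0.08, 0.17, 0.01, 0.05, 0.05, 0.01],
    [0.01, 0.01, 0.87, 0.05, 0.15, 0.97],
    [0.01, 0.02, 0.02, 0.60, 0.40, 0.01],
])
pD_BC = pd.DataFrame(pD_BC_array,
                     index=['D=h', 'D=c', 'D=s', 'D=r'],
                     columns=BC_array)

# Transpose so the (B, C) column labels become the row index,
# group on level 0 (the B labels), sum over C, then transpose back.
pD_B = pD_BC.T.groupby(level=0).sum().T

# Hypothetical extra step: put the columns in the order used in the question.
pD_B = pD_B[['B=n', 'B=m', 'B=s']]

print(pD_B)
```

Selecting the columns by label at the end avoids relying on whatever order groupby returns the groups in.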