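Disclaimer: I'm a physician with no formal training who has been tasked with creating a report on how many patients met their treatment goal, so I need to break the numbers down by facility and clinic. I can get the denominator (total cases by facility and clinic) and the numerator (total cases that met the treatment goal), but I can't figure out how to show both numbers in one groupby result, along with a column showing the percentage that met the goal (num / denom).
Example dataframe: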
import pandas as pd
df = pd.DataFrame([[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],
['anxiety','PTSD','PTSD','anxiety','PTSD','depression','anxiety','anxiety','PTSD','anxiety','anxiety','anxiety','depression','depression','PTSD'],
[False,False,False,True,True,False,True,False,False,False,False,False,False,False,False],
['120C','120C','120C','120C','120C','120C','120C','120C','120C','120C','375C','375C','375C','375C','375C'],
['BH-PSYL','BH-PSYL','BH-YUKON','BH-DENALI','BH-YUKON','BH-DENALI','BH-CFS','BH-CFS','BH-CFS','BH-CFS','BH-HTHPSY','BH-HTHPSY','BH-BSS','BH-HTHPSY','BH-BSS']]).T
df.columns = ['Patient ID','DX Category','Met Goal','Facility','Clinic']
This gives the denominator:
df.groupby(['Facility', 'Clinic']).count()[['Met Goal']]
This gives the numerator:
df[df['Met Goal'] == True].groupby(['Facility', 'Clinic']).count()[['Met Goal']]
The final result should look like this (made-up numbers):
Facility | Clinic   | Met Goal | Cases | Percent
120C     |          |          |       |
         | BH-PSYL  | 1        | 4     | 25%
         | BH-YUKON | 2        | 6     | 33%
375C     |          |          |       |
         | BH-CFS   | 0        | 1     | 0%
Thanks for your help!
Answer 0 (score: 3)
You've already done most of the work. To get the final output, you can do the following:
# First, compute the numerator (sum of True values) and the denominator (count) in one aggregation
df_final = df.groupby(['Facility', 'Clinic']).agg({'Met Goal':['sum', 'count']})
# Then flatten the two-level column names into plain labels
df_final.columns = ['Met Goal','Cases']
# Finally, calculate the fraction of cases that met the goal
df_final['Percent'] = df_final['Met Goal']/df_final['Cases']
You get:
print(df_final)
                    Met Goal  Cases  Percent
Facility Clinic
120C     BH-CFS            1      4     0.25
         BH-DENALI         1      2     0.50
         BH-PSYL           0      2     0.00
         BH-YUKON          1      2     0.50
375C     BH-BSS            0      2     0.00
         BH-HTHPSY         0      3     0.00
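If you want the Percent column shown as "25%" rather than 0.25, as in the desired output from the question, one possible variation is sketched below. It is not part of the answer above. It uses a small made-up sample rather than the asker's data, assumes 'Met Goal' is a true boolean column, and uses the illustrative column names `met_goal` and `cases`. It relies on pandas named aggregation, available since pandas 0.25.

```python
import pandas as pd

# Small made-up sample (hypothetical, not the asker's data);
# 'Met Goal' is assumed to be a real boolean column.
df = pd.DataFrame({
    'Facility': ['120C', '120C', '120C', '375C', '375C'],
    'Clinic':   ['BH-CFS', 'BH-CFS', 'BH-PSYL', 'BH-BSS', 'BH-BSS'],
    'Met Goal': [True, False, False, False, True],
})

# Named aggregation: numerator and denominator in a single pass.
# Summing booleans counts the True values; count gives the group size.
summary = df.groupby(['Facility', 'Clinic']).agg(
    met_goal=('Met Goal', 'sum'),
    cases=('Met Goal', 'count'),
)

# Format the ratio as a whole-number percentage string, e.g. 0.5 -> '50%'.
summary['Percent'] = (summary['met_goal'] / summary['cases']).map('{:.0%}'.format)

print(summary)
```

Note that after formatting, Percent holds strings, so keep the numeric ratio in a separate column if you still need to sort or do arithmetic on it.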