Pandas: group by and count the values of other columns

Time: 2019-11-15 10:04:49

Tags: python pandas

I have a pandas DataFrame:

age   gender   criticality    acknowledged       
 10    Male       High            Yes
 10    Male       High            Yes
 10    Male       High            Yes
 10    Male       Low             Yes
 11    Female     Medium          No

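I want to group by age and gender, then turn the values of 'criticality' and 'acknowledged' into new columns and get the count of each.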

For example, my desired output is:

                 criticality          acknowledged
age  gender    High   Medium   Low     Yes    No
 10    Male    3       0       1        4     0
 11    Female  0       1       0        0     1 

I considered using df.groupby(['age','gender'])['criticality','acknowledged'].stack()

but it does not work.

Is there a better way to get the result in this format?

2 answers:

Answer 0: (score: 1)

Since you want to count these two columns separately, concat is a simple solution:

In [13]: pd.concat([df.pivot_table(index=['age', 'gender'], columns=col, aggfunc
    ...: =len) for col in ['criticality', 'acknowledged']], axis=1).fillna(0)
Out[13]: 
            acknowledged             criticality     
criticality         High  Low Medium          No  Yes
age gender                                           
10  Male             3.0  1.0    0.0         0.0  4.0
11  Female           0.0  0.0    1.0         1.0  0.0
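
As a self-contained sketch of the same idea, pd.crosstab can build each count table directly, and passing keys to concat labels each block with the column it counts. The sample frame is rebuilt from the question; the names cols, tables and result are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "age": [10, 10, 10, 10, 11],
    "gender": ["Male", "Male", "Male", "Male", "Female"],
    "criticality": ["High", "High", "High", "Low", "Medium"],
    "acknowledged": ["Yes", "Yes", "Yes", "Yes", "No"],
})

cols = ["criticality", "acknowledged"]

# One frequency table per column, indexed by (age, gender).
tables = [pd.crosstab([df["age"], df["gender"]], df[col]) for col in cols]

# Put the tables side by side; keys becomes the top level of the columns.
result = pd.concat(tables, axis=1, keys=cols)
print(result)
```

crosstab fills combinations that never occur with 0 and keeps integer counts, so no fillna step is needed here.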

Answer 1: (score: 1)

Another approach is to use assign with get_dummies(), followed by groupby(), and finally to split the columns into a MultiIndex with expand=True:

l=['criticality','acknowledged']
final=df[['age','gender']].assign(**pd.get_dummies(df[l])).groupby(['age','gender']).sum()
final.columns=final.columns.str.split('_',expand=True)
print(final)

                     criticality       acknowledged    
                   High Low Medium           No Yes
age gender                                        
10  Male             3   1      0            0   4
11  Female           0   0      1            1   0
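
If the column order of the desired output matters (High, Medium, Low, then Yes, No), one option is to reindex the columns after the split. This is a minimal sketch, assuming the sample data from the question and that no column name or category contains an underscore, since the split on '_' relies on that. The names value_cols, counts and wanted are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "age": [10, 10, 10, 10, 11],
    "gender": ["Male", "Male", "Male", "Male", "Female"],
    "criticality": ["High", "High", "High", "Low", "Medium"],
    "acknowledged": ["Yes", "Yes", "Yes", "Yes", "No"],
})

value_cols = ["criticality", "acknowledged"]

# One indicator column per (column, value) pair, e.g. "criticality_High".
dummies = pd.get_dummies(df[value_cols])

# Summing the indicators per (age, gender) group gives the counts.
counts = dummies.groupby([df["age"], df["gender"]]).sum().astype(int)

# "criticality_High" -> ("criticality", "High")
counts.columns = counts.columns.str.split("_", expand=True)

# Column order taken from the desired output in the question.
wanted = pd.MultiIndex.from_tuples([
    ("criticality", "High"), ("criticality", "Medium"), ("criticality", "Low"),
    ("acknowledged", "Yes"), ("acknowledged", "No"),
])
counts = counts.reindex(columns=wanted, fill_value=0)
print(counts)
```

Using reindex with fill_value=0 also adds a zero column for any listed category that does not appear in the data.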