How do I create a conditional column using groupby?

Time: 2019-08-07 12:37:12

Tags: python pandas group-by conditional-statements

I have a DataFrame that looks like this.

Group   Task    Status   
1       A       Success
1       B       Success
1       C       Success
2       A       Success
2       B       Success
2       C       Failed

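I want to apply conditional logic per group. If every Status in a group is Success, then Overall_Status is Success for that group. If ANY Status in the group is Failed, then Overall_Status is Failed. The expected result: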

Group   Task    Status         Overall_Status
1       A       Success        Success
1       B       Success        Success
1       C       Success        Success
2       A       Success        Failed
2       B       Success        Failed
2       C       Failed         Failed

1 answer:

Answer 0: (score: 0)

Use Series.eq to compare each value with Success, then GroupBy.transform with GroupBy.all to test whether all values in each group are Success, and set the values with numpy.where:

import numpy as np

# True for every row whose group contains only 'Success' statuses
mask = df['Status'].eq('Success').groupby(df['Group']).transform('all')
df['Overall_Status'] = np.where(mask, 'Success', 'Failed')
print(df)
   Group Task   Status Overall_Status
0      1    A  Success        Success
1      1    B  Success        Success
2      1    C  Success        Success
3      2    A  Success         Failed
4      2    B  Success         Failed
5      2    C   Failed         Failed
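
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the sample table in the question; the rest mirrors the snippet in this answer.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question (an assumption:
# the asker's real frame may have other dtypes or an explicit index).
df = pd.DataFrame({
    "Group": [1, 1, 1, 2, 2, 2],
    "Task": ["A", "B", "C", "A", "B", "C"],
    "Status": ["Success", "Success", "Success",
               "Success", "Success", "Failed"],
})

# Boolean Series aligned with df: True on every row whose group
# contains only 'Success' statuses.
mask = df["Status"].eq("Success").groupby(df["Group"]).transform("all")

# Broadcast the per-group result back onto each row.
df["Overall_Status"] = np.where(mask, "Success", "Failed")
print(df)
```

Because transform returns a result with the same index as the input, the mask can be assigned or passed to numpy.where directly, with no merge back onto the original frame.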

Alternatively, use Series.ne to find the groups that contain any status other than Success, and build the mask with Series.isin:

mask1 = df['Group'].isin(df.loc[df['Status'].ne('Success'), 'Group'].unique())
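
Note that this mask is True for rows in a failing group, so the arguments to numpy.where are swapped compared with the first approach. A minimal sketch of the full flow, assuming the same sample data as in the question (the names failed_groups and in_failed_group are only illustrative):

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed, for illustration only).
df = pd.DataFrame({
    "Group": [1, 1, 1, 2, 2, 2],
    "Task": ["A", "B", "C", "A", "B", "C"],
    "Status": ["Success", "Success", "Success",
               "Success", "Success", "Failed"],
})

# Group labels that have at least one status other than 'Success'.
failed_groups = df.loc[df["Status"].ne("Success"), "Group"].unique()

# True on every row that belongs to one of those groups.
in_failed_group = df["Group"].isin(failed_groups)

# The mask marks failures, so 'Failed' comes first here.
df["Overall_Status"] = np.where(in_failed_group, "Failed", "Success")
print(df)
```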

Another idea is to test whether each group has at least one Failed, using GroupBy.transform with GroupBy.any:

mask1 = df['Status'].eq('Failed').groupby(df['Group']).transform('any')
df['Overall_Status'] = np.where(mask1, 'Failed', 'Success')

print (df)
   Group Task   Status Overall_Status
0      1    A  Success        Success
1      1    B  Success        Success
2      1    C  Success        Success
3      2    A  Success         Failed
4      2    B  Success         Failed
5      2    C   Failed         Failed
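
One thing to keep in mind when choosing between these: the 'all' and 'any' variants agree only when Status is always either Success or Failed. The sketch below uses a hypothetical third value, Pending, which does not appear in the question, to show where they would diverge.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'Pending' is an invented status, not from the question.
df = pd.DataFrame({
    "Group": [1, 1, 2, 2],
    "Status": ["Success", "Pending", "Success", "Failed"],
})

by_group = df["Group"]

# Strict rule: a group succeeds only if every status is 'Success'.
all_success = df["Status"].eq("Success").groupby(by_group).transform("all")
strict = np.where(all_success, "Success", "Failed")

# Lenient rule: a group fails only if some status is literally 'Failed'.
any_failed = df["Status"].eq("Failed").groupby(by_group).transform("any")
lenient = np.where(any_failed, "Failed", "Success")

print(strict.tolist())
print(lenient.tolist())
```

Under these assumptions, group 1 counts as Failed under the strict rule and as Success under the lenient one, so pick the variant that matches how other statuses should be treated.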