I have a data frame that looks like this:
Group Task Status
1 A Success
1 B Success
1 C Success
2 A Success
2 B Success
2 C Failed
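I want to apply conditional logic to the data frame, grouped by Group. If every Status in a group is Success, then Overall_Status for that group should be Success. If any Status in the group is Failed, then Overall_Status should be Failed. The expected output is: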
Group Task Status Overall_Status
1 A Success Success
1 B Success Success
1 C Success Success
2 A Success Failed
2 B Success Failed
2 C Failed Failed
Answer 0 (score: 0)
Use Series.eq to flag the rows whose Status is Success, then GroupBy.transform with GroupBy.all to test whether all values in each group are Success, and set the new column with numpy.where:
import numpy as np

# True for every row whose group contains only 'Success' statuses
mask = df['Status'].eq('Success').groupby(df['Group']).transform('all')
df['Overall_Status'] = np.where(mask, 'Success', 'Failed')
print(df)
Group Task Status Overall_Status
0 1 A Success Success
1 1 B Success Success
2 1 C Success Success
3 2 A Success Failed
4 2 B Success Failed
5 2 C Failed Failed
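For anyone who wants to run the snippet above end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption based on the sample table in the question, and the intermediate name is_success is introduced only for readability.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question (assumed layout).
df = pd.DataFrame({
    'Group': [1, 1, 1, 2, 2, 2],
    'Task': ['A', 'B', 'C', 'A', 'B', 'C'],
    'Status': ['Success', 'Success', 'Success',
               'Success', 'Success', 'Failed'],
})

# Row-wise boolean: is this row's Status equal to 'Success'?
is_success = df['Status'].eq('Success')

# Reduce each group with "all" and broadcast the result back to every row,
# so the mask has the same length and index as df.
mask = is_success.groupby(df['Group']).transform('all')

df['Overall_Status'] = np.where(mask, 'Success', 'Failed')
print(df)
```

transform is used instead of a plain aggregation because it returns one value per original row, so the result can be assigned directly as a new column.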
Alternatively, select the rows whose Status is not Success with Series.ne, collect their groups, and create the mask with Series.isin:
mask1 = df['Group'].isin(df.loc[df['Status'].ne('Success'), 'Group'].unique())
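As a rough sketch of how this isin mask could be used on its own, the steps can be split out as below. The sample DataFrame and the name bad_groups are assumptions added for illustration; note that the mask here marks the failing groups, so the numpy.where arguments are in the order 'Failed', 'Success'.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question (assumed layout).
df = pd.DataFrame({
    'Group': [1, 1, 1, 2, 2, 2],
    'Task': ['A', 'B', 'C', 'A', 'B', 'C'],
    'Status': ['Success', 'Success', 'Success',
               'Success', 'Success', 'Failed'],
})

# Groups that contain at least one row whose Status is not 'Success'.
bad_groups = df.loc[df['Status'].ne('Success'), 'Group'].unique()

# True for every row that belongs to one of those groups.
mask1 = df['Group'].isin(bad_groups)

df['Overall_Status'] = np.where(mask1, 'Failed', 'Success')
print(df)
```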
Another idea is to test whether each group has at least one Failed with GroupBy.any:
# True for every row whose group contains at least one 'Failed' status
mask1 = df['Status'].eq('Failed').groupby(df['Group']).transform('any')
df['Overall_Status'] = np.where(mask1, 'Failed', 'Success')
print(df)
Group Task Status Overall_Status
0 1 A Success Success
1 1 B Success Success
2 1 C Success Success
3 2 A Success Failed
4 2 B Success Failed
5 2 C Failed Failed
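The "all Success" and "any Failed" tests give the same result on the sample data, but they can differ if Status ever holds a third value. The sketch below illustrates this with a hypothetical 'Pending' status, which does not appear in the question and is assumed here only to show the difference.

```python
import pandas as pd

# Hypothetical data: 'Pending' is NOT part of the original question.
df = pd.DataFrame({
    'Group': [1, 1, 2, 2],
    'Status': ['Success', 'Pending', 'Success', 'Failed'],
})

# Approach 1: a group passes only if every row is 'Success'.
all_success = df['Status'].eq('Success').groupby(df['Group']).transform('all')

# Approach 2: a group fails only if at least one row is 'Failed'.
any_failed = df['Status'].eq('Failed').groupby(df['Group']).transform('any')

# Group 1 has no 'Failed' row but is not all 'Success' either,
# so approach 1 labels it Failed while approach 2 labels it Success.
comparison = pd.DataFrame({
    'Group': df['Group'],
    'all_success': all_success,
    'any_failed': any_failed,
})
print(comparison)
```

If Status can only ever be Success or Failed, as in the question, either test should work equally well.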