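Compare one pair of columns and write the result as a label to a new column, then compare another pair of columns and write that result to the same new column
The df looks like this: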
a b c d length
18 32 76 75 8
64 63 76 64 9
55 84 98 45 0
72 92 87 65 0
76 83 23 56 0
36 87 97 12 11
As shown in the dummy data frame, I am comparing the columns in sequence.
My code is as follows:
df['status_flag'] = np.where(df['b'] >= df['a'], "Filtered out based on b>a", None)
df['status_flag'] = np.where(df['d'] >= df['c'], "Filtered out based on d>c", None)
df['status_flag'] = np.where(df['length'] == 0, "Filtered out based on length", None)
This produces the following output:
a b c d length new
18 32 76 75 8
64 68 76 94 9
55 84 98 99 0 "Filtered out based on length"
72 92 87 65 0
76 83 23 56 0 "Filtered out based on length"
36 87 97 100 11
Basically, each assignment replaces the strings that were already set with None. How can I do this some other way?
Expected output:
a b c d length new
18 32 76 75 8 "Filtered out based on b>a"
64 68 76 94 9 "Filtered out based on d>c"
55 84 98 99 0 "Filtered out based on length"
72 92 87 65 0 "Filtered out based on d>c"
76 83 23 56 0 "Filtered out based on length"
36 87 97 100 11 "Passed all filters"
Answer 0 (score: 2)
You can do this as follows:
# Apply the filters in reverse order of priority: a later assignment overwrites an earlier one
df['new'] = 'Passed all filters'
df.loc[df.b > df.a, 'new'] = 'Filtered out based on b>a'
df.loc[df.d > df.c, 'new'] = 'Filtered out based on d>c'
df.loc[df.length == 0, 'new'] = 'Filtered out based on length'
print(df)
a b c d length new
0 18 32 76 75 8 Filtered out based on b>a
1 64 63 76 64 9 Passed all filters
2 55 84 98 45 0 Filtered out based on length
3 72 92 87 65 0 Filtered out based on length
4 76 83 23 56 0 Filtered out based on length
5 36 87 97 12 11 Filtered out based on b>a
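Note: the result above uses the first data frame given in the question, which is different from the one used in the question's example outputs. Running the same commands on the data frame from the examples gives the following result: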
a b c d length new
0 18 32 76 75 8 Filtered out based on b>a
1 64 68 76 94 9 Filtered out based on d>c
2 55 84 98 99 0 Filtered out based on length
3 72 92 87 65 0 Filtered out based on length
4 76 83 23 56 0 Filtered out based on length
5 36 87 97 100 11 Filtered out based on d>c
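An alternative that avoids relying on assignment order is `np.select`, which picks the label of the first condition that is true for each row. The following is a minimal sketch, not part of the original answer; it assumes the data frame from the question's example outputs and reuses the answer's strict `>` comparisons. The names `conditions` and `labels` are chosen here only for illustration.

```python
import numpy as np
import pandas as pd

# Data frame from the question's example outputs (assumed here for illustration).
df = pd.DataFrame({
    "a": [18, 64, 55, 72, 76, 36],
    "b": [32, 68, 84, 92, 83, 87],
    "c": [76, 76, 98, 87, 23, 97],
    "d": [75, 94, 99, 65, 56, 100],
    "length": [8, 9, 0, 0, 0, 11],
})

# np.select uses the first condition that is True for each row,
# so the conditions are listed from highest to lowest priority.
conditions = [
    df["length"] == 0,
    df["d"] > df["c"],
    df["b"] > df["a"],
]
labels = [
    "Filtered out based on length",
    "Filtered out based on d>c",
    "Filtered out based on b>a",
]

# Rows matching none of the conditions get the default label.
df["new"] = np.select(conditions, labels, default="Passed all filters")
print(df)
```

Here the priority is expressed by the order of the `conditions` list instead of by the order of the `df.loc` assignments, so it should label the rows the same way as the sequence of assignments above.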