I have the following DataFrame df in Pandas:
index_1 index_2 index_3
85 91 104
73 25 112
48 97 15
22 85 101
I want to add a new column named SEGMENT to the DataFrame above, based on the index values, as follows:
if ((df['index_1'] > 90) & (df['index_2'] > 90) & (df['index_3'] > 90))
then **SEGMENT** should be **All**
if ((df['index_1'] > 90) & (df['index_2'] > 90))
then **SEGMENT** should be **Medium**
if ((df['index_2'] > 90) & (df['index_3'] > 90))
then **SEGMENT** should be **Medium high**
if ((df['index_2'] > 90))
then **SEGMENT** should be **Medium low**
if ((df['index_3'] > 90))
then **SEGMENT** should be **High**
if none of the indexes are greater than 90, put "None"
The expected result is:
index_1 index_2 index_3 Segment
85 91 104 Medium high
73 25 112 High
48 97 15 None
22 85 101 High
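How can I achieve this with Pandas in Python?
I know it is easy to put each condition in its own separate column, but I need all of these conditions combined in a single column.
Thanks in advance!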
Answer 0 (score: 3)
Use numpy.select:
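import numpy as np
# One boolean mask per column
m1 = df['index_1'] > 90
m2 = df['index_2'] > 90
m3 = df['index_3'] > 90
# The first matching condition wins, so the most specific ones come first
m = [m1 & m2 & m3, m1 & m2, m2 & m3, m2, m3]
v = ['All', 'Medium', 'Medium high', 'Medium low', 'High']
# Rows that match no condition get None
df['Segment'] = np.select(m, v, default=None)
print(df)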
index_1 index_2 index_3 Segment
0 85 91 104 Medium high
1 73 25 112 High
2 48 97 15 Medium low
3 22 85 101 High
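For reference, here is a self-contained sketch of the same numpy.select approach that can be run as-is. The sample data is copied from the question. The names THRESHOLD, conditions, labels and segments are illustrative, not part of the original answer. It also assumes the literal string "None" is wanted for unmatched rows, as the question's wording suggests, rather than a missing value.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "index_1": [85, 73, 48, 22],
    "index_2": [91, 25, 97, 85],
    "index_3": [104, 112, 15, 101],
})

THRESHOLD = 90  # cutoff used in the question (illustrative name)

# One boolean mask per column.
m1 = df["index_1"] > THRESHOLD
m2 = df["index_2"] > THRESHOLD
m3 = df["index_3"] > THRESHOLD

# np.select picks the first condition that is True for each row,
# so the most specific combinations have to be listed first.
conditions = [m1 & m2 & m3, m1 & m2, m2 & m3, m2, m3]
labels = ["All", "Medium", "Medium high", "Medium low", "High"]

# Assumption: unmatched rows should get the string "None".
df["Segment"] = np.select(conditions, labels, default="None")

segments = df["Segment"].tolist()
print(df)
```

Under the rules as stated, the row 48, 97, 15 comes out as Medium low because index_2 is above 90, and none of the four sample rows falls through to the default.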