Question

I have the following dataframe df in Pandas:

index_1    index_2    index_3
85         91         104
73         25         112
48         97         15
22         85         101

I want to add a new column named SEGMENT to this dataframe, based on the index values, as follows:

if ((df['index_1'] > 90) & (df['index_2'] > 90) & (df['index_3'] > 90)) 
then **SEGMENT** should be **All**

if ((df['index_1'] > 90) & (df['index_2'] > 90))
then **SEGMENT** should be **Medium**

if ((df['index_2'] > 90) & (df['index_3'] > 90))
then **SEGMENT** should be **Medium high**

if ((df['index_2'] > 90))
then **SEGMENT** should be **Medium low**

if ((df['index_3'] > 90))
then **SEGMENT** should be **High**

if none of the indexes are greater than 90, put "None"

The expected result is:

index_1    index_2    index_3    Segment
85         91         104        Medium high
73         25         112        High
48         97         15         None
22         85         101        High

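How can I achieve this with Pandas in Python?

I know it is easy to put each condition in its own separate column, but I need all of these conditions combined in a single column.

Thanks in advance!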

Answer 1

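Use numpy.select:

import numpy as np

# One boolean mask per column
m1 = df['index_1'] > 90
m2 = df['index_2'] > 90
m3 = df['index_3'] > 90

# Conditions are checked in order; the first one that is True for a row wins
m = [m1 & m2 & m3, m1 & m2, m2 & m3, m2, m3]
v = ['All','Medium','Medium high','Medium low','High']

df['Segment'] = np.select(m, v, default=None)
print(df)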
   index_1  index_2  index_3      Segment
0       85       91      104  Medium high
1       73       25      112         High
2       48       97       15   Medium low
3       22       85      101         High
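
The order of the condition list matters, because numpy.select picks the first condition that is True for each row. Here is a minimal self-contained sketch that rebuilds the sample data from the question and contrasts the ordering above with a deliberately reordered list. The names `conditions`, `labels` and `reordered` are only illustrative, and using the string "None" as the default (rather than the `None` object) is an assumption based on the wording of the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "index_1": [85, 73, 48, 22],
    "index_2": [91, 25, 97, 85],
    "index_3": [104, 112, 15, 101],
})

m1 = df["index_1"] > 90
m2 = df["index_2"] > 90
m3 = df["index_3"] > 90

# Most specific conditions first: np.select takes the first True per row
conditions = [m1 & m2 & m3, m1 & m2, m2 & m3, m2, m3]
labels = ["All", "Medium", "Medium high", "Medium low", "High"]
df["Segment"] = np.select(conditions, labels, default="None")

# Illustrative only: with the broad m2 test placed first, a row where both
# index_2 and index_3 exceed 90 is caught by m2 before m2 & m3 is considered
reordered = np.select(
    [m2, m2 & m3, m3],
    ["Medium low", "Medium high", "High"],
    default="None",
)

print(df)
print(reordered)
```

With the second ordering, the first row is labelled Medium low instead of Medium high, which is why the combined masks should come before the single-column ones.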
