Create a new Python DataFrame column based on conditions on multiple other columns

Time: 2019-10-08 16:19:04

Tags: python pandas

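I am trying to create a new DataFrame column (Column_C) based on the values of two other columns. There are two conditions: if Column_A > 0 or Column_B contains the string "Apple"*, then Column_C should be "Yes"; otherwise it should be "No".

*Bonus points if the answer is case-insensitive (that is, it would pick up the "apple" in "Pineapple" as well as in "Apple Juice").

The data might look like this (together with the result Column_C should hold):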

Column_A Column_B           Column_C  
23       Orange Juice       Yes  
2        Banana Smoothie    Yes  
8        Pineapple Juice    Yes  
0        Pineapple Smoothie Yes  
0        Apple Juice        Yes  
0        Lemonade           No  
34       Coconut Water      Yes

I have tried a few things, including:

df['Keep6']= np.where((df['Column_A'] >0) | (df['Column_B'].find('Apple')>0) , 'Yes','No')

But I get the error message: "AttributeError: 'Series' object has no attribute 'find'"

2 answers:

Answer 0: (score: 1)

Use Series.str.contains with case=False for a case-insensitive match:

df['Column_C']= np.where((df['Column_A']>0) | (df['Column_B'].str.contains('apple', case=False)) ,'Yes','No')
print(df)

   Column_A            Column_B Column_C
0        23        Orange_Juice      Yes
1         2     Banana_Smoothie      Yes
2         8     Pineapple_Juice      Yes
3         0  Pineapple_Smoothie      Yes
4         0         Apple_Juice      Yes
5         0            Lemonade       No
6        34       Coconut_Water      Yes
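For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the sample data in the question, and the intermediate names `positive` and `has_apple` are only illustrative. Passing `na=False` is an added assumption (the question shows no missing values); it makes any missing entry in Column_B count as "no match" instead of propagating NaN into the mask.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column_A": [23, 2, 8, 0, 0, 0, 34],
    "Column_B": [
        "Orange Juice", "Banana Smoothie", "Pineapple Juice",
        "Pineapple Smoothie", "Apple Juice", "Lemonade", "Coconut Water",
    ],
})

# Elementwise boolean masks; "|" is the vectorized OR between two Series.
positive = df["Column_A"] > 0
# case=False makes the substring match case-insensitive.
# na=False is an assumption: treat missing strings as "does not contain".
has_apple = df["Column_B"].str.contains("apple", case=False, na=False)

# np.where picks "Yes" where the combined mask is True, "No" elsewhere.
df["Column_C"] = np.where(positive | has_apple, "Yes", "No")
print(df)
```

The original attempt failed because string methods on a Series live under the `.str` accessor, so `df['Column_B'].find(...)` does not exist while `df['Column_B'].str.contains(...)` does.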

Answer 1: (score: 0)

Try the following code, which uses the pandas.DataFrame.apply function:

df['Column_C'] = df.apply(lambda row: 'Yes' if (row['Column_A']>0) | (row['Column_B'].lower().find('apple')>=0) else 'No', axis=1)

which gives:

   Column_A            Column_B Column_C
0        23        Orange Juice      Yes
1         2     Banana Smoothie      Yes
2         8     Pineapple Juice      Yes
3         0  Pineapple Smoothie      Yes
4         0         Apple Juice      Yes
5         0            Lemonade       No
6        34       Coconut Water      Yes
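The row-wise version may be easier to read with a named function in place of the lambda. The sketch below is one possible way to write it: `keep_row` is a hypothetical helper name, and it uses plain `or` and the `in` operator instead of `|` and `.find(...) >= 0`, which should behave the same for this data. It assumes every Column_B value is a string.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column_A": [23, 2, 8, 0, 0, 0, 34],
    "Column_B": [
        "Orange Juice", "Banana Smoothie", "Pineapple Juice",
        "Pineapple Smoothie", "Apple Juice", "Lemonade", "Coconut Water",
    ],
})


def keep_row(row):
    """Return 'Yes' if Column_A is positive or Column_B mentions apple."""
    # Lowercasing first makes the substring test case-insensitive.
    if row["Column_A"] > 0 or "apple" in row["Column_B"].lower():
        return "Yes"
    return "No"


# axis=1 calls keep_row once per row, passing the row as a Series.
df["Column_C"] = df.apply(keep_row, axis=1)
print(df)
```

Because apply with axis=1 runs a Python function for each row, it is generally slower on large frames than the vectorized np.where approach from the other answer.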