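I am trying to create a new DataFrame column (Column C) based on the values in two other columns. I have two conditions: if Column A > 0 or Column B contains the string "Apple"*, then Column C should be "Yes"; otherwise it should be "No".
*Bonus points if the answer is case-insensitive (that is, it matches "apple" in both "Pineapple" and "Apple Juice").
The data might look like this (along with the result Column C should produce):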
Column_A  Column_B            Column_C
23        Orange Juice        Yes
2         Banana Smoothie     Yes
8         Pineapple Juice     Yes
0         Pineapple Smoothie  Yes
0         Apple Juice         Yes
0         Lemonade            No
34        Coconut Water       Yes
I have tried a few things, including:
df['Keep6']= np.where((df['Column_A'] >0) | (df['Column_B'].find('Apple')>0) , 'Yes','No')
But I get the error message: "AttributeError: 'Series' object has no attribute 'find'"
Answer 0 (score: 1)
Use Series.str.contains with case=False to make the match case-insensitive:
df['Column_C']= np.where((df['Column_A']>0) | (df['Column_B'].str.contains('apple', case=False)) ,'Yes','No')
print(df)
Column_A Column_B Column_C
0 23 Orange_Juice Yes
1 2 Banana_Smoothie Yes
2 8 Pineapple_Juice Yes
3 0 Pineapple_Smoothie Yes
4 0 Apple_Juice Yes
5 0 Lemonade No
6 34 Coconut_Water Yes
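For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the sample data in the question, and na=False is an assumption added here (not part of the answer above) so that missing values in Column_B count as "no match" instead of propagating NaN into the condition.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Column_A": [23, 2, 8, 0, 0, 0, 34],
    "Column_B": [
        "Orange Juice", "Banana Smoothie", "Pineapple Juice",
        "Pineapple Smoothie", "Apple Juice", "Lemonade", "Coconut Water",
    ],
})

# Case-insensitive substring test; na=False (an assumption) treats
# missing values as "does not contain apple"
has_apple = df["Column_B"].str.contains("apple", case=False, na=False)

# "Yes" when either condition holds, "No" otherwise
df["Column_C"] = np.where((df["Column_A"] > 0) | has_apple, "Yes", "No")
print(df)
```

Each comparison is wrapped in parentheses because | binds more tightly than > in Python.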
Answer 1 (score: 0)
Try the following code, which uses the pandas.DataFrame.apply function:
df['Column_C'] = df.apply(lambda row: 'Yes' if (row['Column_A']>0) | (row['Column_B'].lower().find('apple')>=0) else 'No', axis=1)
which gives:
Column_A Column_B Column_C
0 23 Orange Juice Yes
1 2 Banana Smoothie Yes
2 8 Pineapple Juice Yes
3 0 Pineapple Smoothie Yes
4 0 Apple Juice Yes
5 0 Lemonade No
6 34 Coconut Water Yes
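The same find-based test can also be written without a row-wise apply by going through the Series.str accessor. This is only a sketch on the question's sample data, and the variable names pos, strict and df are illustrative. It also shows why the comparison needs to be >= 0: find returns -1 when the substring is absent and 0 when it sits at the very start, so a > 0 test would miss "Apple Juice".

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Column_A": [23, 2, 8, 0, 0, 0, 34],
    "Column_B": [
        "Orange Juice", "Banana Smoothie", "Pineapple Juice",
        "Pineapple Smoothie", "Apple Juice", "Lemonade", "Coconut Water",
    ],
})

# Position of "apple" in the lower-cased text; -1 means "not found"
pos = df["Column_B"].str.lower().str.find("apple")

# Correct test: any position from 0 upward counts as a match
df["Column_C"] = np.where((df["Column_A"] > 0) | (pos >= 0), "Yes", "No")

# For comparison only: a "> 0" test ignores matches at index 0
strict = np.where((df["Column_A"] > 0) | (pos > 0), "Yes", "No")
print(df.assign(position=pos, with_gt_zero=strict))
```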