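I'm struggling with this! I want to create a new DataFrame column based on a logical OR across several columns.
The data frame looks like this: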
    apples  bananas  oranges
0           bananas
1   apples
2
3                    oranges
4
5           bananas  oranges
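(The blanks in the columns are NaN.) I want to create a new column that indicates whether any fruit is mentioned in the row; how many times it is mentioned doesn't matter. So I would end up with this: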
    apples  bananas  oranges  fruit
0           bananas           fruit
1   apples                    fruit
2
3                    oranges  fruit
4
5           bananas  oranges  fruit
To me this looks like a logical OR over the first three columns, but I can't figure out how to do it.
Answer 0 (score: 1)
If the empty values are NaN, you can use notnull with any and loc:
# axis must be passed by keyword in recent pandas; any(1) no longer works there
df.loc[df.notnull().any(axis=1), 'new'] = 'fruit'
print(df)
   apples  bananas  oranges    new
0     NaN  bananas      NaN  fruit
1  apples      NaN      NaN  fruit
2     NaN      NaN      NaN    NaN
3     NaN      NaN  oranges  fruit
4     NaN      NaN      NaN    NaN
5     NaN  bananas  oranges  fruit
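For anyone who wants to run the NaN case end to end, here is a minimal sketch. The sample DataFrame is reconstructed from the question and the variable name has_fruit is only illustrative; notna is used here as the alias of notnull.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (an assumption); blanks are NaN.
df = pd.DataFrame({
    "apples":  [np.nan, "apples", np.nan, np.nan, np.nan, np.nan],
    "bananas": ["bananas", np.nan, np.nan, np.nan, np.nan, "bananas"],
    "oranges": [np.nan, np.nan, np.nan, "oranges", np.nan, "oranges"],
})

# Row-wise logical OR: True where at least one column is not NaN.
has_fruit = df.notna().any(axis=1)

# Write 'fruit' only into the rows selected by the mask; other rows stay NaN.
df.loc[has_fruit, "new"] = "fruit"
print(df)
```

The mask is computed before the new column exists, so the new column does not influence the result. If the code is rerun on a frame that already has the new column, it is probably safer to build the mask from the three fruit columns only.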
Or, if the empty values are empty strings, use numpy.where with a changed mask:
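import numpy as np
# rows where at least one cell is not an empty string get 'fruit', the rest ''
df['new'] = np.where((df != '').any(axis=1), 'fruit', '')
print(df)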
   apples  bananas  oranges    new
0          bananas           fruit
1  apples                    fruit
2
3                   oranges  fruit
4
5          bananas  oranges  fruit
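The empty-string variant can be sketched the same way. Again the sample data is reconstructed from the question, and fruit_cols and mask are illustrative names; the mask is restricted to the fruit columns so that rerunning the snippet does not count the new column itself.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (an assumption); blanks are empty strings.
df = pd.DataFrame({
    "apples":  ["", "apples", "", "", "", ""],
    "bananas": ["bananas", "", "", "", "", "bananas"],
    "oranges": ["", "", "", "oranges", "", "oranges"],
})

# Only these columns take part in the row-wise OR.
fruit_cols = ["apples", "bananas", "oranges"]

# True where at least one fruit column holds a non-empty string.
mask = (df[fruit_cols] != "").any(axis=1)

# np.where picks 'fruit' where the mask is True and '' elsewhere.
df["new"] = np.where(mask, "fruit", "")
print(df)
```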