Question

I have a question about how to flag rows as True/False depending on whether they contain certain words.

I have a list of words

my_list=['cat','dog','mouse']

and a data frame with 4 columns:

Col1  Col2                                 Col3                      Col4
...   This is the story of a cat           My dad is going to UK     False
...   My dog's name is Bert                The sky is so blue today  False
...   There is no one that understands me  Why are you so sad?       False

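The first column is not relevant here. Col4 was initially set according to some earlier conditions. However, I would like to change its value (False/True) if Col2 and/or Col3 contains one of the words in the list above.

The expected output is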

Col1  Col2                                 Col3                      Col4
...   This is the story of a cat           My dad is going to UK     True
...   The sky is so blue today             My dog's name is Bert     True
...   There is no one that understands me  Why are you so sad?       False

because the first two rows each contain at least one of the words (cat and dog). I tried using str.contains():

pattern = '|'.join(my_list)
df['Col2','Col3'].str.contains(pattern)

but it does not work.

What am I doing wrong?

Answer 1

You need apply here:

pattern = '|'.join(my_list)
# Test each column against the pattern, then flag rows where any column matched
df[['Col2','Col3']].apply(lambda x : x.str.contains(pattern)).any(axis=1)
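For context, the attempt in the question fails because df['Col2','Col3'] looks up a single column keyed by the tuple ('Col2', 'Col3'), and .str is an accessor on a Series, not on a DataFrame. Here is a minimal end-to-end sketch of the apply approach. The frame is my own reconstruction of the sample data, and OR-ing the result into the existing Col4 is an assumption about how previously set True values should be treated. Note that axis is passed by keyword, which recent pandas releases expect.

```python
import pandas as pd

my_list = ['cat', 'dog', 'mouse']

# Hypothetical reconstruction of the frame shown in the question
df = pd.DataFrame({
    'Col1': ['...', '...', '...'],
    'Col2': ['This is the story of a cat',
             "My dog's name is Bert",
             'There is no one that understands me'],
    'Col3': ['My dad is going to UK',
             'The sky is so blue today',
             'Why are you so sad?'],
    'Col4': [False, False, False],
})

pattern = '|'.join(my_list)

# One boolean column per text column, then True where any column matched
mask = df[['Col2', 'Col3']].apply(lambda col: col.str.contains(pattern)).any(axis=1)

# Assumption: rows that were already True stay True; only False can flip to True
df['Col4'] = df['Col4'] | mask

print(df['Col4'].tolist())  # [True, True, False]
```

If Col4 should simply be overwritten instead, df['Col4'] = mask would do.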

or

# Join with a space so that a match cannot straddle the two columns
(df['Col2'] + ' ' + df['Col3']).str.contains(pattern)
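One caveat with both variants: the joined pattern matches substrings and is case-sensitive, so 'cat' would also hit "category" while 'dog' would miss "Dog". If whole-word matching is what you are after (that is an assumption on my part, the question does not say), a sketch along these lines might help. The sample Series is made up purely to show the difference.

```python
import re
import pandas as pd

my_list = ['cat', 'dog', 'mouse']

# Escape each word and wrap the alternation in word boundaries;
# the non-capturing group (?:...) avoids pandas' warning about match groups
pattern = r'\b(?:' + '|'.join(re.escape(w) for w in my_list) + r')\b'

# Hypothetical sample values, not taken from the question
s = pd.Series(['a cat sat', 'my category', "My Dog's bowl", None])

# Plain substring alternation, as in the answer above
plain = s.str.contains('|'.join(my_list), na=False)

# Whole-word, case-insensitive matching; na=False treats missing text as no match
whole = s.str.contains(pattern, case=False, na=False)

print(plain.tolist())  # [True, True, False, False]
print(whole.tolist())  # [True, False, True, False]
```

The same pattern can be dropped into either of the two expressions above in place of the simple '|'.join(my_list).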
