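I have a bunch of sentences and want to find out whether each sentence contains any of a specific set of consecutive words. For example, I have a list like the following: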
list = ["Data Scientist", "Data Analyst", "Data Engineer"]
and sentences like these:
Sentence1 = "I am first going to be a Data Analyst and then a Data Scientist"
Sentence2 = "I only like to be a Data Engineer"
The desired output picks "Data Analyst" and "Data Scientist" for Sentence1, and "Data Engineer" for Sentence2.
Answer 0 (score: 4)
Using Yatu's sample data. A regular expression with re.findall can be used here as an alternative to the in operator:
import re
l = ["Data Scientist", "Data Analyst", "Data Engineer"]
Sentence1 = "I am first going to be a Data Analyst and then a Data Scientist"
re.findall("|".join(l),Sentence1)
Output:
['Data Analyst', 'Data Scientist']
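Joining the phrases with "|" works for this sample data, but it treats each phrase as a regex fragment and can match inside longer words. The following is a minimal sketch of a more defensive variant, not part of the original answer: the helper name find_phrases is hypothetical, and the word-boundary check assumes every phrase starts and ends with a letter, digit, or underscore.

```python
import re

def find_phrases(phrases, sentence):
    """Return the phrases found in sentence, in order of appearance."""
    # Try longer phrases first so a longer phrase wins over a shorter prefix of it.
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape neutralises regex metacharacters inside the phrases.
    # \b restricts matches to word boundaries (assumes phrases begin and end
    # with word characters; a phrase like "C++" would need different handling).
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, ordered)) + r")\b")
    return pattern.findall(sentence)

phrases = ["Data Scientist", "Data Analyst", "Data Engineer"]
Sentence1 = "I am first going to be a Data Analyst and then a Data Scientist"
print(find_phrases(phrases, Sentence1))
# ['Data Analyst', 'Data Scientist']
```

With this variant a sentence such as "MetaData Engineering" produces no match, whereas the plain "|".join pattern would report "Data Engineer".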
Answer 1 (score: 2)
You can use a list comprehension and the in operator to check membership:
l = ["Data Scientist", "Data Analyst", "Data Engineer"]
Sentence1 = "I am first going to be a Data Analyst and then a Data Scientist"
[i for i in l if i in Sentence1]
# ['Data Scientist', 'Data Analyst']
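Note that the comprehension returns matches in the order of the list, not in the order they appear in the sentence. As a rough sketch (the helper name phrases_in_order and the sentences dict are illustrative assumptions, not part of the original answer), the same idea can be applied to several sentences and sorted by position:

```python
def phrases_in_order(phrases, text):
    """Return the phrases contained in text, sorted by where they first occur."""
    hits = [p for p in phrases if p in text]
    # str.find gives the index of the first occurrence, used here as the sort key.
    return sorted(hits, key=text.find)

phrases = ["Data Scientist", "Data Analyst", "Data Engineer"]
sentences = {
    "Sentence1": "I am first going to be a Data Analyst and then a Data Scientist",
    "Sentence2": "I only like to be a Data Engineer",
}

found = {name: phrases_in_order(phrases, text) for name, text in sentences.items()}
print(found)
# {'Sentence1': ['Data Analyst', 'Data Scientist'], 'Sentence2': ['Data Engineer']}
```

Like the in operator itself, this is a plain substring test, so "Data Engineer" would also be reported for a sentence containing "Data Engineering".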
Answer 2 (score: 0)
Use a regular expression for this:
import re
lst = ["Data Scientist", "Data Analyst", "Data Engineer"]
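sentence = "I only like to be a Data Engineer"  # Sentence2 from the question
s = re.compile('|'.join(lst))  # one pattern with the phrases as alternatives
matches = s.findall(sentence)
# ['Data Engineer']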