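Matching two consecutive words in a sentence

Time: 2019-05-30 20:30:18

Tags: python

I have a bunch of sentences and want to find out whether each sentence contains any of a specific set of consecutive words. For example, I have a list like the following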

  list = ["Data Scientist",  "Data Analyst", "Data Engineer"]

and sentences like these

  Sentence1 = "I am first going to be a Data Analyst and then a Data Scientist"
  Sentence2 = "I only like to be a Data Engineer"

The desired output picks "Data Analyst" and "Data Scientist" for Sentence1, and "Data Engineer" for Sentence2.

3 answers:

Answer 0: (score: 4)

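This uses Yatu's sample data. A regular expression should be faster than the in operator here, though that is worth timing on your own data:
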
import re

l = ["Data Scientist",  "Data Analyst", "Data Engineer"]
Sentence1 = "I am first going to be a Data Analyst and then a Data Scientist"

re.findall("|".join(l),Sentence1)

Output:

['Data Analyst', 'Data Scientist']
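
One caveat with joining the phrases directly: any regex metacharacter inside a phrase is interpreted as pattern syntax, and a phrase can also match inside a longer word. The following is a minimal sketch of one way to guard against both, assuming whole-word matching is what is wanted; the names phrases, sentence and pattern are illustrative and not from the original answer.

```python
import re

phrases = ["Data Scientist", "Data Analyst", "Data Engineer"]
sentence = "I am first going to be a Data Analyst and then a Data Scientist"

# Escape each phrase so regex metacharacters are matched literally,
# and wrap the alternation in \b so only whole-word occurrences match.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, phrases)) + r")\b")

matches = pattern.findall(sentence)
print(matches)  # ['Data Analyst', 'Data Scientist']

# Without \b, "Data Scientist" would also be found inside "Data Scientists".
print(pattern.findall("We hired two Data Scientists"))  # []
```

Matches come back in the order they appear in the sentence, not in the order of the phrase list.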

Answer 1: (score: 2)

You can use a list comprehension and the in operator to check for membership:

l = ["Data Scientist",  "Data Analyst", "Data Engineer"]
Sentence1 = "I am first going to be a Data Analyst and then a Data Scientist"

[i for i in l if i in Sentence1]
# ['Data Scientist', 'Data Analyst']
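
The same comprehension can be applied to several sentences at once. This is a small sketch, assuming the sentences are kept in a dict keyed by name; the names phrases, sentences and found are hypothetical. Note that the in operator does plain substring matching, and the results follow the order of the phrase list rather than the order in the sentence.

```python
phrases = ["Data Scientist", "Data Analyst", "Data Engineer"]

sentences = {
    "Sentence1": "I am first going to be a Data Analyst and then a Data Scientist",
    "Sentence2": "I only like to be a Data Engineer",
}

# For each sentence, keep the phrases that occur in it as substrings.
found = {
    name: [p for p in phrases if p in text]
    for name, text in sentences.items()
}

print(found)
# {'Sentence1': ['Data Scientist', 'Data Analyst'], 'Sentence2': ['Data Engineer']}
```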

Answer 2: (score: 0)

Use a regular expression for this:

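import re

lst = ["Data Scientist",  "Data Analyst", "Data Engineer"]
sentence = "I am first going to be a Data Analyst and then a Data Scientist"

# Compile one alternation pattern that matches any of the phrases
s = re.compile('|'.join(lst))

matches = s.findall(sentence)
# ['Data Analyst', 'Data Scientist']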
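
Compiling the pattern once pays off when the same phrase list is checked against many sentences. As a rough sketch, assuming the match positions are also of interest, finditer can report where each phrase starts; the helper name locate is hypothetical.

```python
import re

lst = ["Data Scientist", "Data Analyst", "Data Engineer"]
compiled = re.compile("|".join(map(re.escape, lst)))

def locate(text):
    """Return (phrase, start_index) pairs in the order they occur in text."""
    return [(m.group(0), m.start()) for m in compiled.finditer(text)]

sentence1 = "I am first going to be a Data Analyst and then a Data Scientist"
sentence2 = "I only like to be a Data Engineer"

# The compiled pattern is reused for every sentence.
for text in (sentence1, sentence2):
    print(locate(text))
```

Each start index can be used to slice the phrase back out of the sentence, which is handy for highlighting or further parsing.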