How can I extract the longer strings from a given sentence while ignoring their substrings?

Asked: 2019-11-15 07:04:32

Tags: python-3.x string-matching

I have a list of strings and a sentence, as follows:

list_of_strings=["skin allergy","hair loss","allergy","hair", "skin"]

sentence="She experienced skin allergy and hair loss after using it for 2-3 weeks"

I want to match list_of_strings against sentence and print only the longer phrases (ignoring their substrings):

skin allergy
hair loss

I wrote this code, but it extracts every matching string.

1 answer:

Answer 0: (score: 1)

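Use a regular expression. Alternation tries the alternatives from left to right, so a longer phrase wins over its substrings as long as it comes first in the pattern.

For example: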

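import re

list_of_strings = ["skin allergy", "hair loss", "allergy", "hair", "skin"]
sentence = "She experienced skin allergy and hair loss after using it for 2-3 weeks"

# Put longer phrases first: alternation picks the first alternative that matches.
alternatives = sorted(list_of_strings, key=len, reverse=True)
# Escape each phrase and keep both \b anchors outside the group,
# so the word boundaries apply to every alternative, not just the first.
pattern = re.compile(r"\b(" + "|".join(map(re.escape, alternatives)) + r")\b")

m = pattern.findall(sentence)
print(m)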

Output:

['skin allergy', 'hair loss']
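
If a regular expression is not wanted, a rough alternative is to collect every phrase that occurs in the sentence and then drop any phrase that is contained in another matched phrase. The following is only a minimal sketch and not part of the answer above: the helper name longest_phrases is hypothetical, it uses plain substring tests (so there is no word-boundary check), and it drops a short phrase even when that phrase also appears on its own elsewhere in the sentence.

```python
def longest_phrases(phrases, sentence):
    # Keep only the phrases that occur somewhere in the sentence.
    found = [p for p in phrases if p in sentence]
    # Discard any phrase that is a proper substring of another found phrase.
    return [p for p in found if not any(p != q and p in q for q in found)]


# The list is deliberately ordered shortest-first to show that order does not matter here.
list_of_strings = ["skin", "hair", "allergy", "hair loss", "skin allergy"]
sentence = "She experienced skin allergy and hair loss after using it for 2-3 weeks"
print(longest_phrases(list_of_strings, sentence))
```

The result keeps the order of the input list rather than the order in which the phrases appear in the sentence.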