Question

Suppose I have a sentence like this:

text = 'Romeo and Juliet is a tragedy written by William Shakespeare early in his career about two young star-crossed lovers whose deaths ultimately reconcile their feuding families'

and a list of phrases:

phrases = ['Romeo and Juliet', 'William Shakespeare', 'career', 'lovers', 'deaths', 'feuding families']

Is it possible to exclude these phrases from the text to get

result = ['is', 'a', 'tragedy', 'written', 'by', 'early', 'in', 'his', 'about', 'two', 'young', 'star-crossed', 'whose', 'ultimately', 'reconcile', 'their']

I have used filters before, but only with single words, not phrases.

Answer 1

You can use str.replace to replace each phrase with an empty string, then use str.split to split the resulting string on whitespace.

For example:

for phrase in phrases:
    text = text.replace(phrase, '')

result = text.split()

print(result)
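As a self-contained sketch of the same idea, the loop can be wrapped in a small function so the original text is left untouched. The function name exclude_phrases is only an illustrative choice, not something from the question. Note that str.replace matches plain substrings, so a phrase such as 'career' would also be cut out of a longer word like 'careers'.

```python
def exclude_phrases(text, phrases):
    """Remove every phrase from text, then split what is left on whitespace.

    Hypothetical helper name; the approach is the replace-and-split one above.
    """
    for phrase in phrases:
        # str.replace removes every occurrence of the phrase (substring match)
        text = text.replace(phrase, '')
    # split() with no argument collapses runs of whitespace,
    # so the gaps left by removed phrases produce no empty strings
    return text.split()


text = ('Romeo and Juliet is a tragedy written by William Shakespeare early in '
        'his career about two young star-crossed lovers whose deaths '
        'ultimately reconcile their feuding families')
phrases = ['Romeo and Juliet', 'William Shakespeare', 'career', 'lovers',
           'deaths', 'feuding families']

result = exclude_phrases(text, phrases)
print(result)
```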

Answer 2

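You can loop over the phrases and use Python's replace function to remove each one from the string. After that, split the string on spaces and you should have the desired output.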

Welcome to Stack Overflow, by the way (;

for phrase in phrases:
    text = text.replace(phrase, '')

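# split(' ') yields an empty string for every gap a removed phrase leaves;
# list.remove('') would drop only the first one, so filter them all out
result = [word for word in text.split(' ') if word]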
print(result)
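If partial matches inside longer words are a concern, one possible variant (not part of either answer) is to build a single regular expression with word boundaries. This is only a sketch: the function name exclude_whole_phrases is hypothetical, and it assumes every phrase starts and ends with a letter or digit so that \b behaves as expected.

```python
import re


def exclude_whole_phrases(text, phrases):
    """Remove phrases only where they appear as whole words (hypothetical helper)."""
    if not phrases:
        return text.split()
    # Try longer phrases first so a phrase is not shadowed by a shorter one it contains
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape keeps any regex metacharacters in a phrase literal
    pattern = re.compile(r'\b(?:' + '|'.join(re.escape(p) for p in ordered) + r')\b')
    # Replace each match with a space, then let split() collapse the whitespace
    return pattern.sub(' ', text).split()


sample = 'early in his career he built several careers'
print(exclude_whole_phrases(sample, ['career']))
```

Here 'career' is removed but 'careers' is kept, which the plain replace loop would not do.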

Exclude phrases from text
