Question

Suppose I have some text, for example:

text = "Ophelia is a character in William Shakespeare's drama Hamlet. She is a young noblewoman of Denmark, the daughter of Polonius, sister of Laertes, and potential wife of Prince Hamlet."

and a parallel list of False values, one per word:

wantedWords = [False]*len(text.split())

and a list of phrases and words, for example:

phrases = ['Ophelia', 'Hamlet', 'daughter of Polonius', 'Prince Hamlet']

For every occurrence in the text of an item from the phrases list, I want the corresponding entries of wantedWords set to True.

So the wantedWords list becomes:

wantedWords = [True, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, True, True, True, False, False, False, False, False, False, False, True, True]

Answer 1

This may help.

text = "Ophelia is a character in William Shakespeare's drama Hamlet. She is a young noblewoman of Denmark, the daughter of Polonius, sister of Laertes, and potential wife of Prince Hamlet."
wantedWords = []
phrases = ['Ophelia', 'Hamlet', 'daughter of Polonius', 'Prince Hamlet']

for i in sorted(phrases, key=len, reverse=True):    # Longest phrases first, so 'Prince Hamlet' is replaced before 'Hamlet'.
    if i in text:
        text = text.replace(i, "*"*len(i.split()))     # Replace each found phrase with one '*' per word.

for i in text.split():
    if "*" in i:
        for k in range(i.count("*")):
            wantedWords.append(True)
    else:
        wantedWords.append(False)

print(wantedWords)
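
Note that the replace-based approach above matches substrings (a phrase like 'Ham' would also hit 'Hamlet') and assumes the text contains no '*' of its own. As an alternative, here is a minimal sketch that compares whole words instead. The function name mark_phrases is my own, and stripping surrounding punctuation with string.punctuation is an assumption about what should count as the same word; adjust it if your text needs different normalization.

```python
import string


def mark_phrases(text, phrases):
    """Return one boolean per whitespace-separated word in text."""
    # Strip leading/trailing punctuation so "Hamlet." compares equal to "Hamlet".
    # Inner characters (the apostrophe in "Shakespeare's") are left alone.
    words = [w.strip(string.punctuation) for w in text.split()]
    wanted = [False] * len(words)

    for phrase in phrases:
        target = phrase.split()
        n = len(target)
        if n == 0:
            continue  # ignore empty phrases
        # Slide a window of n words over the text and compare word by word.
        for start in range(len(words) - n + 1):
            if words[start:start + n] == target:
                wanted[start:start + n] = [True] * n
    return wanted


text = ("Ophelia is a character in William Shakespeare's drama Hamlet. "
        "She is a young noblewoman of Denmark, the daughter of Polonius, "
        "sister of Laertes, and potential wife of Prince Hamlet.")
phrases = ['Ophelia', 'Hamlet', 'daughter of Polonius', 'Prince Hamlet']

wantedWords = mark_phrases(text, phrases)
print(wantedWords)
```

Because every phrase is checked against the original word list, overlapping phrases such as 'Hamlet' and 'Prince Hamlet' need no sorting by length; each one simply marks its own positions.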
