import re

sentence = 'Alice was not a bit hurt, and she jumped up on to her feet in a moment.'
words = ['Alice','jumped','played']
To find which of the words occur in sentence, I used the approach from my last post:
[w for w in words if re.search(r'\b{}\b'.format(re.escape(w)), sentence)]
This gives me:
['Alice', 'jumped']
Now, if the words list is given in a different order (words = ['jumped','Alice','played']), I want the matches reported in the order in which they appear in sentence, i.e. I still need:
['Alice', 'jumped']
rather than
['jumped','Alice']
How should I modify the code?
Answer 0: (score: 3)
One approach is to iterate over the words of the sentence and keep only those that appear in the other list:
sentence_words = ['Alice','jumped','played']
words = ['jumped', 'Alice']
in_order = list(filter(set(words).__contains__, sentence_words))
# ['Alice', 'jumped']
Or:
word_set = set(words)
in_order = [word for word in sentence_words if word in word_set]
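The snippets above assume sentence_words is already a list of tokens. Here is a minimal sketch of getting from the raw sentence in the question to that list, assuming a simple \w+ tokenization is good enough. The helper name words_in_sentence_order is hypothetical, and each match is reported once, at its first occurrence:

```python
import re

def words_in_sentence_order(sentence, words):
    """Return the entries of `words` that occur in `sentence`, in sentence order."""
    # Assumption: a "word" is a run of word characters, so punctuation is dropped.
    sentence_words = re.findall(r'\w+', sentence)
    word_set = set(words)
    seen = set()
    in_order = []
    for word in sentence_words:
        # Keep only wanted words, and report each one once (first occurrence).
        if word in word_set and word not in seen:
            seen.add(word)
            in_order.append(word)
    return in_order

sentence = 'Alice was not a bit hurt, and she jumped up on to her feet in a moment.'
print(words_in_sentence_order(sentence, ['jumped', 'Alice', 'played']))
# ['Alice', 'jumped']
```

Because the loop walks the sentence rather than the words list, the order of words has no effect on the result.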
Alternatively, you can build a lookup from each word to the index where it was last seen, and sort by that:
lookup = {word: idx for idx, word in enumerate(sentence_words)}
words.sort(key=lookup.__getitem__)
# words is now ['Alice', 'jumped']
Or perhaps combine the two, which also skips words that are not in the sentence:
new_words = sorted((word for word in words if word in lookup), key=lookup.get)
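Note that the dict comprehension above records the last index at which each word appears, which matters when a word is repeated in the sentence. The sketch below orders by first occurrence instead. It assumes the sentence is tokenized with \w+, and the name first_seen is only illustrative:

```python
import re

sentence = 'Alice was not a bit hurt, and she jumped up on to her feet in a moment.'
words = ['jumped', 'Alice', 'played']

# Assumption: tokenize on runs of word characters.
sentence_words = re.findall(r'\w+', sentence)

# Map each word to the index of its FIRST occurrence;
# setdefault never overwrites an index that is already stored.
first_seen = {}
for idx, word in enumerate(sentence_words):
    first_seen.setdefault(word, idx)

# Drop words that never occur, then order the rest by position in the sentence.
ordered = sorted((w for w in words if w in first_seen), key=first_seen.__getitem__)
print(ordered)  # ['Alice', 'jumped']
```

Filtering before sorting avoids the KeyError that lookup.__getitem__ would raise for a word such as 'played' that is missing from the sentence.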
Answer 1: (score: 1)
You can build your pattern like this:
pattern = r'\b(?:' + '|'.join(map(re.escape, words)) + r')\b'
and use findall, which returns the matches in the order they occur in sentence:
re.findall(pattern, sentence)
To remove duplicates while keeping that order (a plain set would not preserve it):
list(dict.fromkeys(re.findall(pattern, sentence)))
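A minimal end-to-end sketch of this pattern approach on the question's data. It assumes each word should be matched literally (hence re.escape) and de-duplicates with dict.fromkeys, which keeps the first occurrence of each match in order on Python 3.7+:

```python
import re

sentence = 'Alice was not a bit hurt, and she jumped up on to her feet in a moment.'
words = ['jumped', 'Alice', 'played']

# Escape each word so any regex metacharacters in it are matched literally.
pattern = r'\b(?:' + '|'.join(map(re.escape, words)) + r')\b'

# findall scans left to right, so matches come back in sentence order,
# regardless of the order of the alternatives in the pattern.
matches = re.findall(pattern, sentence)

# dict.fromkeys drops repeats while preserving insertion order.
unique_in_order = list(dict.fromkeys(matches))
print(unique_in_order)  # ['Alice', 'jumped']
```

The \b anchors only behave as whole-word boundaries when every entry in words starts and ends with a word character.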