Question

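I have a very fast merge algorithm. It is currently set up to find the unknown words between two lists. What I want is a function that finds the common words, so I need to change the function below to check whether a word is in both vocab and wds. I also don't fully understand how the function works, so any comments on specific lines would be great.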

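def find_unknowns_merge_pattern(vocab, wds):
    # Assumes vocab and wds are both sorted in ascending order.
    result = []
    xi = 0  # index into vocab
    yi = 0  # index into wds

    while True:
        if xi >= len(vocab):          # vocab exhausted: all remaining words are unknown
            result.extend(wds[yi:])
            return result

        if yi >= len(wds):            # wds exhausted: nothing left to check
            return result

        if vocab[xi] == wds[yi]:      # Good, word exists in vocab
            yi += 1

        elif vocab[xi] < wds[yi]:     # Move past this vocab word
            xi += 1

        else:                         # Got word that is not in vocab
            result.append(wds[yi])
            yi += 1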

def check(bigger_vocab, book_words):
    both = []
    for word in bigger_vocab:         # every vocab word is compared with every book word
        for book_word in book_words:
            if word == book_word:
                both.append(word)

    return both

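The problem is that it takes at least 5 seconds, whereas my merge algorithm takes 0.08 seconds. How can I write this function so that it runs faster?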

Answer 1

Are you just trying to find the words that appear in both bigger_vocab and book_words? If so:

both = list(set(bigger_vocab).intersection(book_words))
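
If both lists are already sorted, as the merge pattern in the question requires, the same two-index walk can probably be adapted to collect the common words instead of the unknown ones. The following is only a sketch under that assumption; the name find_common_merge_pattern and the sample word lists are made up for illustration and do not come from the question.

```python
def find_common_merge_pattern(vocab, wds):
    """Return the words of wds that also occur in vocab.

    Assumes both lists are sorted in ascending order.
    """
    result = []
    xi = 0  # index into vocab
    yi = 0  # index into wds

    # Stop as soon as either list is exhausted: no more matches are possible.
    while xi < len(vocab) and yi < len(wds):
        if vocab[xi] == wds[yi]:      # word is in both lists
            result.append(wds[yi])
            yi += 1                   # keep xi so repeated words in wds still match
        elif vocab[xi] < wds[yi]:     # vocab word sorts earlier, move past it
            xi += 1
        else:                         # wds word is not in vocab, skip it
            yi += 1
    return result


# Hypothetical sample data, both lists sorted.
vocab = ["apple", "boy", "dog", "down", "fell", "girl"]
wds = ["apple", "bat", "dog", "dog", "zebra"]
print(find_common_merge_pattern(vocab, wds))  # ['apple', 'dog', 'dog']
```

Note the difference from the set version: the set intersection drops duplicates and returns the words in arbitrary order, while this walk keeps the order and any repeats found in wds. If sorted, duplicate-free output is all that is needed, sorted(set(bigger_vocab).intersection(book_words)) is likely the simpler choice.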
