Merge sort algorithm

Date: 2016-09-11 12:06:30

Tags: python

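I have a very fast merge algorithm. It is currently set up to find unknown words between two lists, that is, words in wds that are not in vocab. I need to change the function below so that it finds common words instead, that is, words that are in both vocab and wds. I can't quite get my head around the function, so any comments on specific lines would be appreciated.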

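def find_unknowns_merge_pattern(vocab, wds):
    # Both lists must be sorted in ascending order for the merge pattern to work.
    result = []
    xi = 0  # index into vocab
    yi = 0  # index into wds

    while True:
        if xi >= len(vocab):
            # vocab is exhausted: every remaining word in wds is unknown
            result.extend(wds[yi:])
            return result

        if yi >= len(wds):
            # every word in wds has been checked
            return result

        if vocab[xi] == wds[yi]:  # Good, word exists in vocab
            yi += 1

        elif vocab[xi] < wds[yi]: # Move past this vocab word,
            xi += 1

        else:                     # Got word that is not in vocab
            result.append(wds[yi])
            yi += 1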

def check(bigger_vocab, book_words):

    both = []
    for words in bigger_vocab:
        for people in book_words:
            if words == people:
                words.split()  # has no effect: the result is discarded
                both.append(words)

    return both

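The problem is that check takes at least 5 seconds, while my merge algorithm takes 0.08 seconds. How can I adapt the merge function so that it does this job faster?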

1 answer:

Answer 0 (score: 0)

Are you just trying to find the words that appear in both bigger_vocab and book_words? If so:

both = list(set(bigger_vocab).intersection(book_words))
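Note that the set-based one-liner does not preserve order and drops duplicates. If both lists are already sorted, which the merge function in the question relies on, the same merge pattern can probably be adapted to collect common words instead of unknown ones. The following is only a minimal sketch under that assumption; the name find_common_merge_pattern and the sample lists are hypothetical and not from the original post:

```python
def find_common_merge_pattern(vocab, wds):
    """Return the words of wds that also occur in vocab.

    Assumption: both lists are already sorted in ascending order.
    """
    result = []
    xi = 0  # index into vocab
    yi = 0  # index into wds

    # Stop as soon as either list is exhausted: no further matches are possible.
    while xi < len(vocab) and yi < len(wds):
        if vocab[xi] == wds[yi]:
            # Word is in both lists: keep it and move to the next word in wds.
            result.append(wds[yi])
            yi += 1
        elif vocab[xi] < wds[yi]:
            # vocab word is smaller: move past it.
            xi += 1
        else:
            # wds word is not in vocab: skip it.
            yi += 1

    return result


# Hypothetical sample data, already sorted.
vocab = ["apple", "banana", "cherry", "melon"]
wds = ["apple", "cherry", "cherry", "kiwi"]
common = find_common_merge_pattern(vocab, wds)
print(common)
```

Like the original function, this sketch makes a single pass over both lists, so it avoids the nested loops in check. It keeps repeated words from wds; if each word should be reported only once, the set intersection is the simpler choice.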