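I have a very fast merge algorithm. It is currently set up to check for unknown words between lists. I also have a function that checks for common words, and I need to change the merge function below so that it checks whether a word is in both vocab and wds. I don't fully understand the function, so any comments on specific lines would be great.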
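def find_unknowns_merge_pattern(vocab, wds):
    # Both vocab and wds must already be sorted for the merge to work.
    result = []
    xi = 0  # index into vocab
    yi = 0  # index into wds
    while True:
        if xi >= len(vocab):
            # vocab is exhausted: every remaining word in wds is unknown
            result.extend(wds[yi:])
            return result
        if yi >= len(wds):
            # wds is exhausted: nothing left to check
            return result
        if vocab[xi] == wds[yi]:  # Good, word exists in vocab
            yi += 1
        elif vocab[xi] < wds[yi]:  # Move past this vocab word
            xi += 1
        else:  # Got a word that is not in vocab
            result.append(wds[yi])
            yi += 1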
def check(bigger_vocab, book_words):
    # Compares every vocab word with every book word, so the cost grows
    # with len(bigger_vocab) * len(book_words).
    both = []
    for words in bigger_vocab:
        for people in book_words:
            if words == people:
                both.append(words)
    return both
The problem is that it takes at least 5 seconds, whereas my merge algorithm takes 0.08. How can I use the merge function to do this faster?
Answer 0 (score: 0)
Are you just trying to find the words that appear in both bigger_vocab and book_words? If so:
both = list(set(bigger_vocab).intersection(book_words))
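Note that the set version returns the common words in no particular order and drops duplicates. If you would rather keep the merge pattern from the question, here is a minimal sketch of how it could be adapted to collect common words instead of unknown ones. It assumes both lists are already sorted, as the original merge function requires. The name find_common_merge_pattern and the sample data are made up for illustration and do not come from the question.

```python
def find_common_merge_pattern(vocab, wds):
    """Return the words present in both sorted lists, in sorted order."""
    result = []
    xi = 0  # index into vocab
    yi = 0  # index into wds
    # Stop as soon as either list runs out: nothing further can match.
    while xi < len(vocab) and yi < len(wds):
        if vocab[xi] == wds[yi]:
            # Word is in both lists: record it once, then advance wds.
            if not result or result[-1] != wds[yi]:
                result.append(wds[yi])
            yi += 1
        elif vocab[xi] < wds[yi]:
            # vocab word is smaller, so it cannot match later wds entries.
            xi += 1
        else:
            # wds word is not in vocab: skip it.
            yi += 1
    return result


# Hypothetical sample data; sorted() guarantees the required ordering.
vocab = sorted(["apple", "banana", "cherry", "date"])
wds = sorted(["banana", "banana", "date", "fig"])
common = find_common_merge_pattern(vocab, wds)
print(common)  # ['banana', 'date']
```

Each step advances at least one index, so the loop does at most len(vocab) + len(wds) comparisons, rather than comparing every pair as the nested loops in check do.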