Question

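I'm writing a spell-check function that uses two text files: one containing text with misspellings, and one containing a large list of dictionary words. I have converted the misspelled text into lists of strings (one list per line) and the dictionary file into a single list of words. Now I need to check whether the words from the misspelled text are in my dictionary word list.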

def spellCheck():
    checkFile=input('Enter file name: ')
    inFile=open(checkFile,'r')

# This separates my original text file into a list like this
# [['It','was','the','besst','of','times,'],
# ['it','was','teh','worst','of','times']]

    separate=[]
    for line in inFile:
        separate.append(line.split())

# This opens my list of words from the dictionary and 
# turns it into a list of the words.

    wordFile=open('words.txt','r')
    words=wordFile.read()
    wordList=(list(words.split()))
    wordFile.close()


# I need this newList to be a list of the correctly spelled words 
# in my separate[] list and if the word isn't spelled correctly 
# it will go into another if statement... 

    newList=[]
    for word in separate:
        if word in wordList:
            newList.append(word)
    return newList

Answer 1

Try this:

newList = []
for line in separate:
    for word in line:
        if word in wordList:
            newList.append(word)
return newList

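The problem is that you are iterating over separate, which is a list of lists. wordList contains only strings, never lists, so the test `word in wordList` is comparing a whole sublist against strings and always fails. The words you want to check are inside the sublists of separate, so you need a second for loop to iterate over them. Alternatively, you can flatten the nesting with for word in itertools.chain.from_iterable(separate).

Hope this helps.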
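The itertools variant mentioned above can be sketched as follows. This is only an illustration: the contents of separate come from the comment in the question, while the contents of wordList are made up for the example.

```python
from itertools import chain

# Sample data: `separate` mirrors the question's comment;
# the dictionary words in `wordList` are an assumption for this sketch.
separate = [['It', 'was', 'the', 'besst', 'of', 'times,'],
            ['it', 'was', 'teh', 'worst', 'of', 'times']]
wordList = ['it', 'was', 'the', 'best', 'worst', 'of', 'times']

# chain.from_iterable lazily flattens one level of nesting,
# so each `word` is a string rather than a sublist.
newList = [word for word in chain.from_iterable(separate) if word in wordList]
print(newList)
```

Note that 'It' and 'times,' are not matched here, because the comparison is case-sensitive and the trailing comma is still part of the string.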

Answer 2

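First, a word about data structures. You should use sets instead of lists, since you (apparently) only need one copy of each word. You can create sets from your lists: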

# separate is a list of lists, so flatten it while building the set
input_words = set(word for line in separate for word in line)
# word_list is the question's wordList, renamed
correct_words = set(word_list)

Then it's simple:

new_list = input_words.intersection(correct_words)

If you want the incorrect words, that's another one-liner:

incorrect = input_words.difference(correct_words)

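Note that I used names_with_underscores instead of CamelCase, as recommended in PEP 8. Keep in mind, however, that this is not a very effective spell check, since it does not look at context.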
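Putting the set-based approach together, a minimal self-contained sketch might look like the following. The dictionary contents are made up, and the normalize helper is a hypothetical addition that is not part of the original answer: it lowercases each word and strips surrounding punctuation so that tokens such as 'It' and 'times,' can match.

```python
import string

def normalize(word):
    """Hypothetical helper: lowercase and strip surrounding punctuation."""
    return word.strip(string.punctuation).lower()

# Sample data: `separate` mirrors the question's comment;
# the dictionary words in `word_list` are an assumption for this sketch.
separate = [['It', 'was', 'the', 'besst', 'of', 'times,'],
            ['it', 'was', 'teh', 'worst', 'of', 'times']]
word_list = ['it', 'was', 'the', 'best', 'worst', 'of', 'times']

correct_words = set(word_list)

# Flatten the list of lists, normalize each token, and drop empty results
# (a token made only of punctuation normalizes to '').
normalized = (normalize(word) for line in separate for word in line)
input_words = {word for word in normalized if word}

correct = input_words.intersection(correct_words)
incorrect = input_words.difference(correct_words)

print(sorted(correct))
print(sorted(incorrect))
```

Sets discard duplicates and ordering, so this tells you which distinct words are misspelled, not where they occur. If positions matter, keeping the nested loop and testing each word against correct_words preserves order while still giving fast lookups.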

Python: checking whether words are in a list
