Question

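I'm learning Python with Program Arcade Games, and I'm stuck on one of the labs.

I'm supposed to compare every word of a text file (http://programarcadegames.com/python_examples/en/AliceInWonderLand200.txt) against a dictionary file (http://programarcadegames.com/python_examples/en/dictionary.txt) and print each word that is not in the dictionary. I'm supposed to use a linear search.

The problem is that words I know are missing from the dictionary file never get printed. Any help would be appreciated.

My code is below: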

# Imports regular expressions
import re

# This function takes a line of text and returns
# a list of words in the line


def split_line(line):
    split = re.findall('[A-Za-z]+(?:\'\"[A-Za-z]+)?', line)
    return split


# Opens the dictionary text file and adds each line to an array, then closes the file
dictionary = open("dictionary.txt")
dict_array = []
for item in dictionary:
    dict_array.append(split_line(item))
print(dict_array)
dictionary.close()

print("---Linear Search---")

# Opens the text for the first chapter of Alice in Wonderland
chapter_1 = open("AliceInWonderland200.txt")

# Breaks down the text by line
for each_line in chapter_1:
    # Breaks down each line to a single word
    words = split_line(each_line)
    # Checks each word against the dictionary array
    for each_word in words:
        i = 0
        # Continues as long as there are more words in the dictionary and no match
        while i < len(dict_array) and each_word.upper() != dict_array[i]:
            i += 1
        # if no match was found print the word being checked
        if not i <= len(dict_array):
            print(each_word)

# Closes the first chapter file
chapter_1.close()

Answer 1

Linear search to find spelling errors in Python

Something like this should work (pseudocode):

sampleDict = {}
For each word in AliceInWonderLand200.txt:
    sampleDict[word] = True

actualWords = {}
For each word in dictionary.txt:
    actualWords[word] = True

For each word in sampleDict:
    if not (word in actualWords):
        # Oh no!  word isn't in the dictionary

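A set would probably be more appropriate than a dict, since the values stored in these dictionaries don't matter. This should help you get going, but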
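To make the pseudocode concrete, here is a minimal runnable sketch, not a definitive solution. It shows a linear search (what the lab asks for) next to the set-based lookup suggested above. The function names and the small in-memory `sample_dictionary` and `sample_text` lists are hypothetical stand-ins for the two files. The regex is a simplified version of the one in the question, and the sketch assumes, as the question's `.upper()` call implies, that dictionary entries are uppercase. It also avoids two things in the question's code that keep anything from printing: each dictionary entry is appended as a list (so a string never equals it), and `not i <= len(dict_array)` can never be true because `i` never exceeds the list length.

```python
import re


def split_line(line):
    """Return the words in a line (letters, with an optional apostrophe part)."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", line)


def linear_search(items, key):
    """Return the index of key in items, or -1 if it is not present."""
    i = 0
    while i < len(items) and items[i] != key:
        i += 1
    # The loop stops at i == len(items) when no match was found.
    return i if i < len(items) else -1


def unknown_words_linear(text_lines, dictionary_words):
    """Collect words whose uppercase form is not found by linear search."""
    unknown = []
    for line in text_lines:
        for word in split_line(line):
            if linear_search(dictionary_words, word.upper()) == -1:
                unknown.append(word)
    return unknown


def unknown_words_set(text_lines, dictionary_words):
    """Same result as above, using a set for membership tests."""
    known = set(dictionary_words)
    return [
        word
        for line in text_lines
        for word in split_line(line)
        if word.upper() not in known
    ]


# Hypothetical in-memory data standing in for dictionary.txt and the Alice text.
# With the real files, dictionary_words would hold one stripped string per line,
# not a list per line.
sample_dictionary = ["ALICE", "WAS", "BEGINNING", "TO", "GET", "VERY", "TIRED"]
sample_text = ["Alice was beginning to get very tired", "of sitting by her sister"]

print(unknown_words_linear(sample_text, sample_dictionary))
print(unknown_words_set(sample_text, sample_dictionary))
```

Both calls print the same list of words. The set version only swaps the membership test, so it would not satisfy an assignment that requires a linear search.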
