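I'm learning Python with Program Arcade Games, and I'm stuck on one of the labs.
I'm supposed to compare each word of a text file (http://programarcadegames.com/python_examples/en/AliceInWonderLand200.txt) against a dictionary file (http://programarcadegames.com/python_examples/en/dictionary.txt) and print the word if it isn't in the dictionary. I'm supposed to use a linear search.
The problem is that even the words I know are not in the dictionary file aren't being printed. Any help would be appreciated.
My code is as follows: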
# Imports regular expressions
import re

# This function takes a line of text and returns
# a list of words in the line
def split_line(line):
    split = re.findall('[A-Za-z]+(?:\'\"[A-Za-z]+)?', line)
    return split

# Opens the dictionary text file and adds each line to an array, then closes the file
dictionary = open("dictionary.txt")
dict_array = []
for item in dictionary:
    dict_array.append(split_line(item))
print(dict_array)
dictionary.close()

print("---Linear Search---")

# Opens the text for the first chapter of Alice in Wonderland
chapter_1 = open("AliceInWonderland200.txt")

# Breaks down the text by line
for each_line in chapter_1:
    # Breaks down each line to a single word
    words = split_line(each_line)
    # Checks each word against the dictionary array
    for each_word in words:
        i = 0
        # Continues as long as there are more words in the dictionary and no match
        while i < len(dict_array) and each_word.upper() != dict_array[i]:
            i += 1
        # if no match was found print the word being checked
        if not i <= len(dict_array):
            print(each_word)

# Closes the first chapter file
chapter_1.close()
Answer 0 (score: 0)
Linear search to find spelling mistakes in Python
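Something like this should do it (a sketch that reuses split_line from the question):
sampleDict = {}
with open("AliceInWonderLand200.txt") as sample_file:
    for line in sample_file:
        for word in split_line(line):
            # upper() mirrors the question's comparison against the dictionary
            sampleDict[word.upper()] = True
actualWords = {}
with open("dictionary.txt") as dictionary_file:
    for line in dictionary_file:
        for word in split_line(line):
            actualWords[word.upper()] = True
for word in sampleDict:
    if word not in actualWords:
        # Oh no! word isn't in the dictionary
        print(word)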
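Since the lab asks for a linear search, it may also help to see why the loop in the question never prints anything. Two things appear to combine. First, `dict_array.append(split_line(item))` stores a one-element list per dictionary line, so `each_word.upper() != dict_array[i]` compares a string with a list and is always true; the `while` loop therefore always runs to the end. Second, `i` can never exceed `len(dict_array)`, so `if not i <= len(dict_array)` is never true. Below is a minimal sketch of one way to repair both, not the lab's official solution. The helper names `linear_search` and `find_unknown_words` are made up for illustration, the regex drops the `\"` from the question's pattern on the assumption that only an apostrophe was intended, and the dictionary is assumed to hold one upper-case word per line.

```python
import re


def split_line(line):
    """Return the words in a line (letters, optionally followed by 'letters)."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", line)


def linear_search(items, key):
    """Return the index of key in items, or -1 if it is absent."""
    i = 0
    # Walk the list until we run out of items or find a match.
    while i < len(items) and items[i] != key:
        i += 1
    return i if i < len(items) else -1


def find_unknown_words(text_lines, dictionary_lines):
    """Return the words from text_lines that are missing from the dictionary."""
    # Store plain strings: extend() adds each word, whereas append() would
    # add the whole list returned by split_line as a single element.
    dict_array = []
    for line in dictionary_lines:
        dict_array.extend(word.upper() for word in split_line(line))

    unknown = []
    for line in text_lines:
        for word in split_line(line):
            # -1 means the search reached the end without a match.
            if linear_search(dict_array, word.upper()) == -1:
                unknown.append(word)
    return unknown


# Tiny in-memory stand-ins for dictionary.txt and the Alice text (assumed data).
sample_dictionary = ["ALICE\n", "WAS\n", "TIRED\n"]
sample_text = ["Alice was veery tired.\n"]
print(find_unknown_words(sample_text, sample_dictionary))
```

With the real files, open file objects can be passed directly, since both parameters only need to be iterables of lines, for example `find_unknown_words(open_text_file, open_dictionary_file)` inside a `with open(...)` block.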
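A set would probably be more appropriate than a dict for sampleDict and actualWords, since the values stored in them don't matter. This should help you, but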