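Hello, I've recently been trying to write a Python 3 program that reads a text file containing 23005 words. The user then enters a string of 9 characters, and the program should build words from those characters and compare them with the words in the text file.
I want to print the words that are 4-9 letters long and that also contain the middle letter of the list. For example, if the user enters the string "anitsksem", the fifth letter, "s", must appear in the word.
This is how far I've gotten on my own: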
# Open selected file & read
filen = open("svenskaOrdUTF-8.txt", "r")
# Read all rows and store them in a list
wordList = filen.readlines()
# Close File
filen.close()
# letterList index
i = 0
# List of letters that user will input
letterList = []
# List of words that are our correct answers
solvedList = []
# User inputs 9 letters that will be stored in our letterList
string = input(str("Ange Nio Bokstäver: "))
userInput = False
# Checks if user input is correct
while userInput == False:
    # if the string is equal to 9 letters
    # insert letter into our letterList.
    # also set userInput to True
    if len(string) == 9:
        userInput = True
        for char in string:
            letterList.insert(i, char)
            i += 1
    # If string not equal to 9 ask user for a new input
    elif len(string) != 9:
        print("Du har inte angivit nio bokstäver")
        string = input(str("Ange Nio Bokstäver: "))
# For each word in wordList
# and for each char within that word
# check if said word contains a letter from our letterList
# if it does and meets the requirements to be a correct answer
# add said word to our solvedList
for word in wordList:
    for char in word:
        if char in letterList:
            if len(word) >= 4 and len(word) <= 9 and letterList[4] in word:
                print("Char:", word)
                solvedList.append(word)
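The problem I'm having is that it prints words that contain at least one letter from letterList, instead of printing only the words made up solely of letters from letterList. This also means that some words are printed several times, for example when a word contains more than one letter from letterList.
I've been trying to solve these problems for a while, but I can't seem to figure it out. I also tried using permutations to create every possible combination of the letters in the list and then comparing them with my wordList, but I felt that solution was too slow given the number of combinations that have to be generated.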
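For a rough sense of why the permutation approach gets slow, the number of candidate strings can be counted. This is only a back-of-the-envelope sketch: the helper name `count_arrangements` is made up for illustration, and it assumes the nine letters are treated as nine distinct positions (repeated letters such as the two "s" in "anitsksem" would produce some duplicate strings).

```python
from math import factorial


def count_arrangements(n_letters=9, min_len=4, max_len=9):
    """Count ordered selections of min_len..max_len letters out of n_letters."""
    return sum(
        factorial(n_letters) // factorial(n_letters - k)
        for k in range(min_len, max_len + 1)
    )


total = count_arrangements()
print(total)  # 985824 candidate strings, versus 23005 words in the file
```

Checking each of the 23005 words against the nine letters is therefore far less work than generating every arrangement and looking it up.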
Also, since I'm quite new to Python, I'd really appreciate any general tips you have to share.
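Answer 0 (score: 1)
You get words printed multiple times mainly because you iterate over every character of a given word, and each time that character is in letterList you append the word and print it.
Instead, iterate word by word rather than character by character, and use the with context manager so the file is closed automatically: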
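with open('american-english') as f:
    for w in f:
        # Drop the trailing newline and any surrounding whitespace
        w = w.strip()
        # Every character must come from letterList, and the middle letter must be present
        cond = all(i in letterList for i in w) and letterList[4] in w
        # Keep words of 4 to 9 letters (inclusive)
        if 9 >= len(w) >= 4 and cond:
            print(w)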
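Here cond is used to shorten the if statement, all(..) checks whether every character of the word is in letterList, and w.strip() removes any surrounding whitespace, including the trailing newline.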
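One thing the all(..) check does not do is limit how often a letter is used: a word with three "s" passes even if the input only has two. The question does not say whether letters may be reused, so the following is only a sketch under the assumption that each input letter can be used at most once. The helper name `can_build` and the sample words are made up for illustration.

```python
from collections import Counter


def can_build(word, letters):
    """Return True if word uses each letter at most as often as it occurs in letters."""
    # Counter subtraction keeps only positive counts, i.e. the letters
    # that word needs more of than letters can supply.
    missing = Counter(word) - Counter(letters)
    return not missing


letters = "anitsksem"
print(can_build("mista", letters))  # True: every letter is available
print(can_build("sass", letters))   # False: needs three "s", only two given
```

If reuse is allowed, the plain all(..) test is enough and this stricter check can be skipped.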
Also, to fill letterList when the input is 9 letters long, don't use insert. Instead, just pass the string to list; the list is built the same way but noticeably faster:
This:
if len(string) == 9:
    userInput = True
    for char in string:
        letterList.insert(i, char)
        i += 1
can be written as:
if len(string) == 9:
    userInput = True
    letterList = list(string)
With these changes, the initial open and readlines are no longer needed, and neither is the initialization of letterList.
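Put together, the suggestions in this answer could look roughly like the sketch below. The function name `find_words` and the sample lines are hypothetical; the function accepts any iterable of lines, so an open file object (for example from `open("svenskaOrdUTF-8.txt", encoding="utf-8")`, assuming the file really is UTF-8 as its name suggests) can be passed in place of the list.

```python
def find_words(lines, letters, min_len=4, max_len=9):
    """Yield words built only from letters that also contain the middle letter."""
    allowed = set(letters)                    # set membership tests are fast
    middle = letters[len(letters) // 2]       # index 4 for a 9-letter input
    for line in lines:
        word = line.strip()                   # drop the trailing newline
        if (min_len <= len(word) <= max_len
                and middle in word
                and all(ch in allowed for ch in word)):
            yield word


# Hypothetical sample lines standing in for the word file
sample_lines = ["mask\n", "stek\n", "tank\n", "is\n", "minister\n"]
print(list(find_words(sample_lines, "anitsksem")))  # ['mask', 'stek']
```

Because each word is tested exactly once, no word can be printed more than once.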
Answer 1 (score: 0)
You can try this logic:
for word in wordList:
    # readlines() keeps the trailing newline, so strip it before checking
    word = word.strip()
    # if not a valid word, skip it - moving this check outside the inner loop improves performance
    if len(word) < 4 or len(word) > 9 or letterList[4] not in word:
        continue
    # count the characters of the word that appear in letterList
    match_count = 0
    for char in word:
        if char in letterList:
            match_count += 1
    # the word qualifies only if every one of its characters matched
    if match_count == len(word):
        print("Char:", word)
        solvedList.append(word)
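The counting loop is equivalent to asking whether the word's characters form a subset of the input letters. A minimal sketch of that idea follows; `uses_only` is a made-up helper name. It also shows why lines returned by readlines() should be stripped first, since they still end with a newline that is never one of the input letters.

```python
def uses_only(word, letters):
    """Return True if every character of word occurs somewhere in letters."""
    return set(word) <= set(letters)


letters = list("anitsksem")
raw = "mask\n"  # a line as returned by readlines()
print(uses_only(raw, letters))          # False: "\n" is not one of the letters
print(uses_only(raw.strip(), letters))  # True
```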
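Answer 2 (score: 0)
You can do this in a single expression with a list comprehension (no lambda function is actually needed). This is just a POC; I leave it to you to turn it into a complete solution.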
# Read the whole file and split it into words
with open("test.text", "r") as filen:
    word_list = filen.read().split()
print("Enter your string")
# The fifth character (index 4) is the letter every word must contain
search_letter = input()[4]
solved_list = [word for word in word_list if 4 <= len(word) <= 9 and search_letter in word]
print(solved_list)