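I want to search a .txt file for a list of words and print every line in the file that contains any word from that list.
I first split the raw_input (called userInput) with .split() and got a list of words. After that, I filtered that word list against a second, blacklist word list and got the final filtered word list. I now want to search the text file for any of the words in that final list.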
exWords = ['Who', 'How', 'What', 'How many', 'How much', 'am', 'is', 'are', '?', '!']
while True:
    userInput = raw_input("> ")
    uqWords = userInput.split()
    fqWords = [word for word in uqWords if not any(bad in word for bad in exWords)]
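After splitting userInput with .split() and calling the result uqWords, I filter out any word that matches an entry in the exWords list and call the output fqWords. Now I want to search Database.txt for any of the words in the fqWords list and print the matching lines.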
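One detail of the filter described above: `bad in word` is a substring test between two strings, so a word such as "this" is dropped because it contains "is", and an entry such as 'How many' can never match a single token produced by .split(). The sketch below compares that behaviour with an exact-match filter. It assumes exact, case-sensitive matching is what was wanted; `filter_substring` and `filter_exact` are names made up for this illustration and do not appear in the original code.

```python
ex_words = ['Who', 'How', 'What', 'How many', 'How much', 'am', 'is', 'are', '?', '!']


def filter_substring(words, excluded):
    # Same test as the comprehension in the question: drop a word if any
    # excluded string occurs anywhere inside it.
    return [w for w in words if not any(bad in w for bad in excluded)]


def filter_exact(words, excluded):
    # Hypothetical alternative: drop a word only if it equals an excluded entry.
    excluded = set(excluded)
    return [w for w in words if w not in excluded]


words = "What is this color".split()
print(filter_substring(words, ex_words))  # ['color']
print(filter_exact(words, ex_words))      # ['this', 'color']
```

Which of the two is appropriate depends on whether partial matches such as "this" should count as blacklisted.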
To be specific, my full code is:
import time
import random
Error = ["Sorry, I don't understand.", "I don't get it"]
exWords = ['Who', 'How', 'What', 'How many', 'How much', 'am', 'is', 'are', '?', '!']
R = "Rel > "
while True:
    userInput = raw_input("> ")
    uqWords = userInput.split()
    fqWords = [word for word in uqWords if not any(bad in word for bad in exWords)]
    DB = open("Database.txt")
    for line in DB:
        if fqWords in line:
            print (R + line[:-1])
    CDB = open("CodeDB.txt")
    for code in CDB:
        if fqWords in code:
            print (R + code[:-1])
            break
    if fqWords not in (code and line):
        randomError = random.choice(Error)
        print (R + (randomError))
Answer 0 (score: 3)
Try using this function:
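def search_for_lines(filename, words_list):
    # Counts the lines that contain at least one of the words
    words_found = 0
    with open(filename) as db_file:
        for line_no, line in enumerate(db_file):
            if any(word in line for word in words_list):
                # format() gives the same output under Python 2 and 3
                print('{}: {}'.format(line_no, line.rstrip('\n')))
                words_found += 1
    return words_found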
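Just pass the name of the file to search and the list of words. It prints the line number along with the line's content, and returns the number of lines that contained any of the words. As the file is iterated line by line, enumerate yields a tuple of the line number (counting from 0) and the line itself.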
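The combination of enumerate and any described above can be tried without touching the disk. The following is a minimal sketch, not the answer's exact function: `find_matching_lines` is a hypothetical name, it returns the matches instead of printing them, it numbers lines from 1 rather than 0, and `io.StringIO` stands in for an open text file.

```python
import io


def find_matching_lines(lines, words):
    # Collect (line_number, text) pairs for every line containing any word.
    # enumerate() counts from 0 by default; start=1 gives 1-based numbers.
    matches = []
    for line_no, line in enumerate(lines, start=1):
        if any(word in line for word in words):
            matches.append((line_no, line.rstrip("\n")))
    return matches


# io.StringIO stands in for an open text file in this illustration.
fake_db = io.StringIO("apples are red\nthe sky is blue\nbananas are yellow\n")
print(find_matching_lines(fake_db, ["sky", "bananas"]))
# [(2, 'the sky is blue'), (3, 'bananas are yellow')]
```

Returning the pairs rather than printing them leaves the caller free to add a prefix or to count the matches with len().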
To add this to your existing code and search both files, define the function first, then call it right after fqWords is assigned:
import random
def search_for_lines(filename, words_list):
    words_found = 0
    with open(filename) as db_file:
        for line_no, line in enumerate(db_file):
            if any(word in line for word in words_list):
                # format() gives the same output under Python 2 and 3
                print('{}: {}'.format(line_no, line.rstrip('\n')))
                words_found += 1
    return words_found
Error = ["Sorry, I don't understand.", "I don't get it"]
exWords = ['Who', 'How', 'What', 'How many', 'How much', 'am', 'is', 'are', '?', '!']
R = "Rel > "
while True:
    userInput = raw_input("> ")
    uqWords = userInput.split()
    fqWords = [word for word in uqWords if not any(bad in word for bad in exWords)]
    # Add up the matches from both files so a hit in either one counts
    words_found = search_for_lines("Database.txt", fqWords)
    words_found += search_for_lines("CodeDB.txt", fqWords)
    if words_found > 0:
        break
    else:
        randomError = random.choice(Error)
        print(R + randomError)
Answer 1 (score: 0)
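If you don't need to modify a list, use a tuple. For naming identifiers, see PEP 8.
To get the difference of two collections, use set, e.g. {1,2,3} - {2,3} is {1}.
If you open the same files inside the loop, they are reopened on every iteration, so it is better to move the open calls out of the loop.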
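The set-difference tip has two side effects worth seeing in a small example: it removes only exact matches, and it discards word order and duplicates. The sketch below uses a made-up input sentence and exclusion tuple; the order-preserving list comprehension is a hypothetical alternative for cases where the original order matters.

```python
user_words = "how do I sort a list how".split()
excluded = ("how", "do", "I", "a")

# Set difference: only exact matches are removed; order and duplicates are lost.
remaining = frozenset(user_words) - frozenset(excluded)
print(sorted(remaining))  # ['list', 'sort']

# Hypothetical order-preserving variant using the same exact-match rule.
excluded_set = frozenset(excluded)
ordered = [w for w in user_words if w not in excluded_set]
print(ordered)  # ['sort', 'list']
```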
import random
def get_line_with_words(lines, words):
    """Return (line number, stripped line) pairs for the lines
    that contain any of the words
    """
    return [(i, line.strip()) for i, line in enumerate(lines, 1) if any(word in line for word in words)]
errors = ("Sorry, I don't understand.", "I don't get it")
ex_words = ('Who', 'How', 'What', 'How many', 'How much', 'am', 'is', 'are', '?', '!')
prefix = "Rel > "
with open("Database.txt") as db, open("CodeDB.txt") as cdb:
    while True:
        user_input = raw_input("> ")
        uq_words = user_input.split()
        fq_words = frozenset(uq_words) - frozenset(ex_words)
        res1 = get_line_with_words(db, fq_words)
        res2 = get_line_with_words(cdb, fq_words)
        # A match in either file is enough to print the results and stop
        if res1 or res2:
            for n, line in res1 + res2:
                print('{} {} {}'.format(prefix, n, line))
            break
        print('{} {}'.format(prefix, random.choice(errors)))
        # Rewind both files so the next iteration reads them from the start
        db.seek(0)
        cdb.seek(0)
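The seek(0) calls matter because a file object remembers its position: once a loop has read it to the end, iterating again yields nothing until the position is reset. A minimal sketch of that behaviour, with `io.StringIO` standing in for an open text file (an assumption made so the example runs without Database.txt):

```python
import io

# io.StringIO stands in for an open text file in this illustration.
db = io.StringIO("first line\nsecond line\n")

first_pass = [line.strip() for line in db]
second_pass = [line.strip() for line in db]  # position is already at the end
db.seek(0)                                   # rewind to the start
third_pass = [line.strip() for line in db]

print(first_pass)   # ['first line', 'second line']
print(second_pass)  # []
print(third_pass)   # ['first line', 'second line']
```

If the files are small, another option is to read them into lists once with readlines() before the loop and search the lists instead.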