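I'm trying to create a function that finds lines in a file that start with certain prefixes (they don't need to be spelled out explicitly), for example "aaa a line that continues the sentence" or "iii another continuing sentence", and writes the exact lines it found to another file named blacklist.
For example, let's say my file is produced by this function: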
def writeletters(self):
    outf = "xfile.txt"
    alphabet = ['a','b','c','d','e','f', 'g', 'h' ,'i']
    with open(outf, "w") as a:
        i = 0
        b = 5
        while i < len(alphabet):
            a.write((alphabet[i] * b) + '\n')
            i += 1
The output is:
aaaaa
bbbbb
ccccc
ddddd
eeeee
fffff
ggggg
hhhhh
iiiii
How can I get the lines that start with "aaa" or "iii" sent or written to another file, so that these lines remain?
bbbbb
ccccc
ddddd
eeeee
fffff
ggggg
hhhhh
To try to achieve what I want, I wrote the blackList function below, but it obviously doesn't work:
def blackList(self):
    filep = "xfile.txt"
    blacklist = ['aaa', 'iii']
    i = 0
    with open(filep) as bl:
        for line in bl:
            i + 1
            if any(s in line for s in blacklist):
                print blacklist[i]
Answer 0 (score: 2)
You can simplify this considerably:
def blackList(self):
    filep = "xfile.txt"
    output = "output.txt"
    blacklist = ['aaa', 'iii']
    with open(filep, "r") as in_fh, open(output, "w") as out_fh:
        to_write = []
        for line in in_fh.readlines():
            for bad_entry in blacklist:
                if line.startswith(bad_entry):  # keep bad lines
                    to_write.append(line)
                    break  # avoid adding the same line twice
        out_fh.writelines(to_write)
For a terser but less obvious approach, try the following:
def blacklist_writer(input_file, output_file, blacklist):
    with open(input_file, "r") as in_fh, open(output_file, "w") as out_fh:
        # keep l only if the nested list of matching blacklist entries is non-empty
        out_fh.write("".join(l for l in in_fh.readlines() if [b for b in blacklist if l.startswith(b)]))
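This creates a generator that checks each line of input_file against a nested list comprehension, which builds a list of every blacklist entry the line starts with. If there are no matches, the list is empty and therefore "falsey", so the line is skipped.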
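The same truthiness test can be written without the inner list, because str.startswith also accepts a tuple of prefixes. Here is a minimal sketch of that idea; the helper name filter_blacklisted is hypothetical, and it works on an in-memory list of lines rather than on files:

```python
def filter_blacklisted(lines, blacklist):
    """Return the lines that start with any of the blacklist prefixes."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    prefixes = tuple(blacklist)
    return [line for line in lines if line.startswith(prefixes)]


sample = ["aaaaa\n", "bbbbb\n", "iiiii\n"]
matches = filter_blacklisted(sample, ["aaa", "iii"])
print(matches)
```

An empty blacklist gives an empty tuple, for which startswith returns False, so no line is kept in that case.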
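Answer 1 (score: 0)
You could use a regular expression, but the pattern will vary depending on what you are trying to filter. If you really just want to filter out lines that start with three a's or three i's, you can use re.match():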
import re
regex_pattern = 'a{3}|i{3}'
def writeletters(regex_pattern):
    with open('xfile.txt', 'r') as file:
        for line in file:
            if re.match(regex_pattern, line):
                print(line)  # replace this line with code to write to file
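regex_pattern says "three a's in a row, or three i's in a row". re.match() only matches at the beginning of the string, so a line is matched only if it starts with the given pattern.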
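To illustrate why re.match() suits a "starts with" check, here is a small sketch comparing it with re.search(), which scans the whole string. The helper names starts_with_match and contains_match are made up for this illustration:

```python
import re

pattern = re.compile(r"a{3}|i{3}")


def starts_with_match(line):
    # re.match only succeeds if the pattern matches at position 0.
    return pattern.match(line) is not None


def contains_match(line):
    # re.search succeeds if the pattern matches anywhere in the line.
    return pattern.search(line) is not None


for text in ["aaaaa\n", "xxaaa\n", "bbbbb\n"]:
    print(repr(text), starts_with_match(text), contains_match(text))
```

A line such as "xxaaa" is found by the search variant but not by the match variant, which is the behaviour wanted when only the start of the line matters.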
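Answer 2 (score: 0)
I realized that my original attempt at solving this was close. I just needed to output the line instead of an entry from my blacklist list, so I'll post my solution as well. (A silly mistake.)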
def blackList(self):
    filep = "xfile"
    blacklist = ['aaa', 'iii']
    out = "blacklist.txt"
    with open(filep) as bl, open(out, "w") as output:
        for line in bl:
            # note: "s in line" matches the entry anywhere in the line
            if any(s in line for s in blacklist):
                output.write(line)
The version that does the opposite, writing out the original file's lines without the blacklisted ones, is as follows:
def blackList(self):
    filep = "xfile"
    blacklist = ['aaa', 'iii']
    out = "blacklist.txt"
    with open(filep) as bl, open(out, "w") as output:
        for line in bl:
            # keep only the lines that contain no blacklist entry
            if not any(s in line for s in blacklist):
                output.write(line)
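The two variants above can also be combined into a single pass that writes the matching lines to one file and the rest to another. The following is only a sketch under a few assumptions: the function name split_by_blacklist and the file name remaining.txt are hypothetical, and it uses startswith (a prefix check, as the question asks) rather than the substring check above. The demo runs in a temporary directory so no real files are touched.

```python
import os
import tempfile


def split_by_blacklist(src_path, matched_path, kept_path, blacklist):
    """Write lines starting with a blacklist prefix to matched_path and
    all other lines to kept_path. Returns the (matched, kept) line counts."""
    prefixes = tuple(blacklist)  # str.startswith accepts a tuple of prefixes
    matched = kept = 0
    with open(src_path) as src, \
            open(matched_path, "w") as matched_fh, \
            open(kept_path, "w") as kept_fh:
        for line in src:
            if line.startswith(prefixes):
                matched_fh.write(line)
                matched += 1
            else:
                kept_fh.write(line)
                kept += 1
    return matched, kept


# Demo: recreate the question's xfile.txt in a throwaway directory.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "xfile.txt")
with open(src, "w") as fh:
    for letter in "abcdefghi":
        fh.write(letter * 5 + "\n")

counts = split_by_blacklist(
    src,
    os.path.join(workdir, "blacklist.txt"),
    os.path.join(workdir, "remaining.txt"),
    ["aaa", "iii"],
)
print(counts)
```

Reading the source once and writing to both outputs avoids rewriting the input file in place, which would otherwise need a temporary copy.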