Say my input file looks like this:
lines
BeginModeData apple
lines
EndModuleData
BeginModeData banana
lines
EndModuleData
BeginModeData orange
lines
EndModuleData
...
I want to delete all the lines that belong to "banana", so that it looks like this:
lines
BeginModeData apple
lines
EndModuleData
BeginModeData orange
lines
EndModuleData
...
So far my Python code almost works, but it also removes every other "EndModuleData" line, which is not what I want:
linelist = open("infile.txt").readlines()
newfile = open('outfile', 'w')
flag = 1
for line in linelist:
    if line.startswith("BeginModeData banana"):
        flag = 0
    if line.startswith("EndModuleData"):
        flag = 1
    if flag and not line.startswith("EndModuleData"):
        newfile.writelines(line)
How can I improve my little script to make it work? Thanks for your help.
Answer 0 (score: 2)
Try this:
flag = 1
for line in linelist:
    if line.startswith("BeginModeData banana"):
        flag = 0
    if flag:
        newfile.write(line)
    if line.startswith("EndModuleData"):
        flag = 1
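As a side note, it is better to use the with keyword when working with file objects. The advantage is that the file is properly closed once the block finishes, even if an exception is raised along the way: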
with open("infile") as infile, open("outfile", "w") as outfile:
    for line in infile:
        ...
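Putting the flag loop and the with statement together, a minimal sketch might look like the following. The function names drop_block and filter_file are hypothetical, chosen only for illustration, and the block name is passed as a parameter instead of being hard-coded. Like the loop above, it matches the opening line with startswith, so a block named "banana2" would also be dropped when asking for "banana".

```python
def drop_block(lines, name):
    """Yield every line except those inside the 'BeginModeData <name>' block."""
    skipping = False
    for line in lines:
        # Start skipping at the opening line of the unwanted block.
        if line.startswith("BeginModeData " + name):
            skipping = True
        if not skipping:
            yield line
        # Resume output only after the closing line has itself been skipped.
        if skipping and line.startswith("EndModuleData"):
            skipping = False


def filter_file(src, dst, name):
    """Copy src to dst without the named block; both files are closed on exit."""
    with open(src) as infile, open(dst, "w") as outfile:
        outfile.writelines(drop_block(infile, name))
```

Calling filter_file("infile.txt", "outfile", "banana") would then write the filtered copy. Because drop_block is a generator, the input is processed line by line and never has to be held in memory all at once.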
Answer 1 (score: 0)
You can read the whole file into a single string and use Python's regular expression module re to replace the whole pattern:
import re

s = open("infile.txt").read()  # read everything into a single multiline string
newfile = open('outfile', 'w')
# match the block non-greedily (*?) so it stops at the first EndModuleData
# instead of matching all the way to the last one
new_s = re.sub('BeginModeData banana(\n.*?)*?\nEndModuleData\n', '', s, flags=re.MULTILINE)
newfile.write(new_s)
newfile.close()
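The same regular expression idea can be wrapped in a small function, sketched below. The name remove_block is hypothetical. This variant is not the pattern from the answer: it anchors the match at the start of a line with ^ (which is what re.MULTILINE enables), escapes the block name, and consumes whole lines lazily. It assumes every line, including the final EndModuleData, ends with a newline.

```python
import re


def remove_block(text, name):
    """Return text without the 'BeginModeData <name>' ... 'EndModuleData' block."""
    pattern = re.compile(
        # ^ anchors at a line start because of re.MULTILINE.
        r"^BeginModeData " + re.escape(name) + r"\n"
        # Lazily consume whole lines until the first closing line.
        r"(?:.*\n)*?"
        r"EndModuleData\n",
        flags=re.MULTILINE,
    )
    return pattern.sub("", text)
```

Since the opening line must be followed directly by a newline, a block named "banana2" is left alone when "banana" is requested, unlike the startswith check in the loop-based version.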