Running multiple different re.sub calls on a single file in Python

Time: 2015-10-20 20:58:21

Tags: python regex python-2.7

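I have a file generated by my server, and I'm trying to clean it up by removing unneeded extra characters from the lines, including at the start and end of each line. The problem is that after every re.sub I have to create a new file and then delete the old one. I currently have about 10 re.sub calls, and creating and deleting a file for each one feels inefficient.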

import os
import re

def linecleanup():

    file_in = open('Server.txt', 'r')
    file_out = open("Server.txt1", "w")
    lines = file_in.read()
    # Replace four whitespace characters followed by <revision> with a label.
    regex = re.sub(r"\s\s\s\s<revision>", "Revision: ", lines)
    file_out.write(regex)
    file_in.close()
    file_out.close()

    os.remove('Server.txt')

linecleanup()

def linecleanup1():

    file_in = open('Server.txt1', 'r')
    file_out = open("Server.txt2", "w")
    lines = file_in.read()
    # Replace each closing </version> tag with a single space.
    regex = re.sub("</version>", " ", lines)
    file_out.write(regex)
    file_in.close()
    file_out.close()

    os.remove('Server.txt1')

linecleanup1()

def linecleanup2():

    file_in = open('Server.txt2', 'r')
    file_out = open("Server.txt3", "w")
    lines = file_in.read()
    # Replace each closing </revision> tag with a space and a newline.
    regex = re.sub("</revision>", " " + '\n', lines)
    file_out.write(regex)
    file_in.close()
    file_out.close()

    os.remove('Server.txt2')

linecleanup2()

1 Answer:

Answer 0: (score: 0)

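This code is untested, but you could do something like the following: read the file once, apply every substitution to the string in memory, and write the result once.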

import re

def linecleanup():

    # Read the whole file into memory once.
    with open("Server.txt1", 'r') as file_in:
        lines = file_in.read()

    # Apply each substitution to the in-memory string.
    lines = re.sub(r"\s\s\s\s<revision>", "Revision: ", lines)
    # Where a simple find&replace (non-regex) is required, you could just use this instead:
    # lines = lines.replace("</version>"," ")
    lines = re.sub("</version>", " ", lines)
    lines = re.sub("</revision>", " " + '\n', lines)

    # Write the cleaned text out once.
    with open("Server.txt3", "w") as outp:
        outp.write(lines)

linecleanup()
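
With around 10 substitutions, the same idea can be written as a loop over a table of rules, so that adding a rule means adding one line. The following is only a sketch of that variation: the names SUBSTITUTIONS, clean_text and clean_file are made up for illustration, and the three rules are the ones shown in the question.

```python
import re

# Hypothetical rule table: (compiled pattern, replacement) pairs.
# These three rules come from the question; add the remaining ones here.
SUBSTITUTIONS = [
    (re.compile(r"\s{4}<revision>"), "Revision: "),
    (re.compile(r"</version>"), " "),
    (re.compile(r"</revision>"), " \n"),
]


def clean_text(text, rules=SUBSTITUTIONS):
    """Apply every (pattern, replacement) rule in order and return the result."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


def clean_file(src_path, dst_path, rules=SUBSTITUTIONS):
    """Read src_path once, clean its contents in memory, write dst_path once."""
    with open(src_path, "r") as file_in:
        text = file_in.read()
    with open(dst_path, "w") as file_out:
        file_out.write(clean_text(text, rules))
```

Calling clean_file("Server.txt", "Server_clean.txt") would then replace the whole chain of intermediate files (the output name here is an assumption). The rules run in list order, so order matters if one rule's output can match a later rule's pattern.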