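Python: Problem when trying to read and write multiple files

Time: 2014-06-16 11:56:25

Tags: python regex iterator

This script reads and rewrites every individual HTML file in a directory. It iterates over the matches, highlights them, and writes the output. The problem is that, after highlighting the last instance of a search term, the script drops all remaining content after that last instance in each file's output. Thanks for any help.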

import os
import sys
import re

source = raw_input("Enter the source files path:")

listfiles = os.listdir(source)

for f in listfiles:
    filepath = os.path.join(source+'\\'+f)
    infile = open(filepath, 'r+')
    source_content = infile.read()

    color = ('red')
    regex = re.compile(r"(\b in \b)|(\b be \b)|(\b by \b)|(\b user \b)|(\bmay\b)|(\bmight\b)|(\bwill\b)|(\b's\b)|(\bdon't\b)|(\bdoesn't\b)|(\bwon't\b)|(\bsupport\b)|(\bcan't\b)|(\bkill\b)|(\betc\b)|(\b NA \b)|(\bfollow\b)|(\bhang\b)|(\bbelow\b)", re.I)

    i = 0; output = ""
    for m in regex.finditer(source_content):
        output += "".join([source_content[i:m.start()],
                           "<strong><span style='color:%s'>" % color[0:],
                           source_content[m.start():m.end()],
                           "</span></strong>"])

        i = m.end()
    outfile = open(filepath, 'w')
    outfile.seek(0, 2)
    outfile.write(output)
    print "\nProcess Completed!\n"
    infile.close()
    outfile.close()


raw_input()

1 Answer:

Answer 0 (score: 2)

After the for loop ends, you need to append whatever is left after the last match:

        ...
        i = m.end()
    output += source_content[i:]  # Here's the end of your file
    outfile = open(filepath, 'w')
    ...
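
To make the fix easier to check in isolation, here is a minimal sketch that pulls the highlighting into a function. The names `highlight`, `highlight_sub`, and `PATTERN` are hypothetical, and the pattern is a shortened stand-in for the one in the question, not the original. It should run under both Python 2 and Python 3.

```python
import re

# Shortened, hypothetical pattern for illustration only;
# the question's pattern has many more alternatives.
PATTERN = re.compile(r"\b(may|might|will)\b", re.I)

# Markup taken from the question: a coloured, bold span around the match.
TEMPLATE = "<strong><span style='color:%s'>%s</span></strong>"


def highlight(text, regex=PATTERN, color="red"):
    """Wrap every match in the highlight markup, keeping all other text."""
    pieces = []
    i = 0
    for m in regex.finditer(text):
        pieces.append(text[i:m.start()])             # text before this match
        pieces.append(TEMPLATE % (color, m.group(0)))  # the highlighted match
        i = m.end()
    pieces.append(text[i:])                          # tail after the last match
    return "".join(pieces)


def highlight_sub(text, regex=PATTERN, color="red"):
    """Possible alternative: let regex.sub copy the unmatched text itself."""
    def wrap(m):
        return TEMPLATE % (color, m.group(0))
    return regex.sub(wrap, text)
```

Inside the question's file loop, the inner `for m in regex.finditer(...)` block could then probably be replaced by `output = highlight(source_content, regex, color)`. Note that without the final tail append, a file containing no matches at all would be written back empty, since `output` would never receive any text.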