Question

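I'm trying to edit a specific HTML file that I downloaded with Python. I've run into a problem: when I run my code to edit the file, my Python context locks up. I checked the file it was writing to and found two files: an .html file and a .bak file.

The .html file starts at 0 KB, and the .bak file keeps growing up to a point, roughly 12 MB. Then the .html file grows to a larger size, and then the .bak file grows again. This seems to loop endlessly. The HTML file I'm editing is 22 KB. I watched the output files just to see whether it would ever stop... it didn't.

Here is the function I'm using to edit the file: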

def replace(self, search_str, replace_str):
    f = open(self.path,'r+')
    content = f.readlines()
    for i, line in enumerate(content):
        content[i] = line.replace(search_str, replace_str)
    f.writelines(content)
    f.close()

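I imagine the problem comes from the fact that the downloaded HTML file is mostly on a single line of about 21,000 characters. Any ideas?

Edit:

I also tried another function, but got the same result: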

def replace(self, search_str, replace_str):
    assert self.path != None, 'No file path provided.'
    fi = fileinput.FileInput(self.path,inplace=1)
    for line in fi:
        if search_str in line:
             line=line.replace(search_str,replace_str)
        print line
    fi.close()
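
For reference, the .bak file mentioned above is expected with this approach: with in-place mode enabled, `fileinput` moves the original to a backup whose default extension is `.bak`, redirects standard output into a new file at the original path, and deletes the backup when the file is closed. A minimal Python 3 sketch of the same idea follows. The standalone function name `replace_inplace` is hypothetical, and this is not a diagnosis of the endless growth described above.

```python
import fileinput


def replace_inplace(path, search_str, replace_str):
    """Hypothetical standalone version of the fileinput approach (Python 3)."""
    # inplace=True moves the original to a temporary '.bak' backup and
    # redirects print() output into a fresh file at the original path.
    with fileinput.input(files=(path,), inplace=True) as fi:
        for line in fi:
            # Each line keeps its own newline, so suppress the one print() adds.
            print(line.replace(search_str, replace_str), end='')
```

Note that `print line` in Python 2 appends an extra newline to every line; `end=''` avoids that here.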

Answer 1

Try using a generator. If you need to read a large file, this is the approach you want:

for line in open(self.path,'r+'):
    # do stuff with line
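
A minimal sketch of what that loop might look like as a generator, assuming Python 3. The helper name `replaced_lines` is hypothetical; it only reads the file lazily and leaves the writing to the caller.

```python
def replaced_lines(path, search_str, replace_str):
    """Yield each line of the file with search_str replaced (hypothetical helper)."""
    with open(path, 'r') as f:
        # A file object is its own line iterator, so only one line
        # is held in memory at a time.
        for line in f:
            yield line.replace(search_str, replace_str)
```

Keep in mind that a file made up of one very long line still arrives as a single string, so line-by-line iteration saves memory only when the file actually contains many lines.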

Answer 2

I rewrote the function to write everything to a new file, and after that it worked.

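import os

def replace(self, search_str, replace_str):
    # Write the edited text to a temporary file, then swap it in for the original.
    # splitext only strips the final extension, so dots in directory names are safe.
    new_path = os.path.splitext(self.path)[0] + '.TEMP'
    with open(self.path, 'r') as f, open(new_path, 'w') as new_f:
        # Iterating over f reads one line at a time.
        for line in f:
            new_f.write(line.replace(search_str, replace_str))
    os.remove(self.path)
    os.rename(new_path, self.path)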
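
The same write-then-swap idea can also be sketched as a standalone function. This version assumes Python 3.3 or later (for `os.replace`) and the platform's default text encoding; the function name `replace_in_file` is hypothetical. It creates the temporary file in the same directory as the target so the final rename stays on one filesystem.

```python
import os
import tempfile


def replace_in_file(path, search_str, replace_str):
    """Rewrite path with search_str replaced, going through a temporary file."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix='.tmp')
    try:
        # newline='' keeps the original line endings untouched.
        with os.fdopen(fd, 'w', newline='') as tmp, \
                open(path, 'r', newline='') as src:
            for line in src:
                tmp.write(line.replace(search_str, replace_str))
        # Swap the finished temporary file in for the original.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.remove(tmp_path)
        raise
```

Because the original is only replaced after the new file is fully written, a failure partway through leaves the original untouched.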

Editing a file with one crazy long line in Python
