Question

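I have more than 1,000 HTML files.

For each one, I want to:

Read the file.
Remove a specific line.
Overwrite the file (not append to it).

The following code works, but calling 'open' twice feels wasteful. Can I write this more simply?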

import glob
import os

# `dir` is the directory that holds the HTML files
for file_path in glob.glob(os.path.join(dir, '*.html')):
    with open(file_path, "r", encoding="utf-8") as reader:
        html_ = reader.read()
        replaced = html_.replace("<?xml version=\"1.0\" encoding=\"UTF-8\" ?>", "")
        with open(file_path, "w", encoding="utf-8") as writer:
            writer.write(replaced)

I tried:

'r+': the new content gets appended.
'w+': the read() method returns ''.

Answer 1

Yes, open the file in 'r+' mode, then "rewind" after reading (seek back to the start):

with open(file_path, "r+", encoding="utf-8") as f:
    html_ = f.read()
    f.seek(0)
    replaced = html_.replace("<?xml version=\"1.0\" encoding=\"UTF-8\" ?>", "")
    f.write(replaced)
    f.truncate()

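I also added the f.truncate() call, because you are removing data from the file. Without that call you would not replace all of the data in the file: the new content is shorter than the old, so the last len(removed_data) bytes of the old content would still be left at the end.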
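To make the effect of truncate() concrete, here is a minimal sketch that runs the same 'r+' rewrite twice on a temporary file, once with and once without the truncate() call. The helper names (rewrite_shorter, demo) and the shortened "<?xml ?>" declaration are hypothetical and used only for illustration.

```python
import os
import tempfile

def rewrite_shorter(path, old, new, truncate):
    """Replace `old` with `new` in place using a single 'r+' handle."""
    with open(path, "r+", encoding="utf-8") as f:
        text = f.read()
        f.seek(0)                      # rewind before writing
        f.write(text.replace(old, new))
        if truncate:
            f.truncate()               # cut the file off at the current position

def demo(truncate):
    """Create a temp file, rewrite it, and return the resulting content."""
    fd, path = tempfile.mkstemp(suffix=".html")
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write("<?xml ?><html></html>")
        rewrite_shorter(path, "<?xml ?>", "", truncate)
        with open(path, encoding="utf-8") as f:
            return f.read()
    finally:
        os.remove(path)

with_truncate = demo(truncate=True)      # only the new content remains
without_truncate = demo(truncate=False)  # tail of the old content is left behind
```

Without truncate(), the file keeps its old length, so the leftover tail is exactly as long as the text that was removed.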

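Your attempts failed because with 'r+' you never sought back to the start, so writing began where reading stopped (at the end of the file), and because 'w+' truncates the file first, setting its length to 0 and discarding the contents before you can read them.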
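Both failure modes can be reproduced on a throwaway file. The sketch below is only illustrative; make_file and read_back are hypothetical helpers, and the file content "old" / "new" is made up.

```python
import os
import tempfile

def make_file(content):
    """Create a temporary file holding `content` and return its path."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return path

def read_back(path):
    """Return the file's content and delete the file."""
    with open(path, encoding="utf-8") as f:
        data = f.read()
    os.remove(path)
    return data

# 'r+' without seek(0): the write starts where the read stopped (end of file)
path = make_file("old")
with open(path, "r+", encoding="utf-8") as f:
    f.read()
    f.write("new")
r_plus_result = read_back(path)

# 'w+': the file is truncated on open, so there is nothing left to read
path = make_file("old")
with open(path, "w+", encoding="utf-8") as f:
    w_plus_read = f.read()
    f.write("new")
w_plus_result = read_back(path)
```

With 'r+' the old content survives and the new text lands after it; with 'w+' the old content is gone before read() is ever called.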

Another approach is the fileinput module; it lets you replace a file's contents with a simpler pattern:

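import fileinput

# inplace=True cannot be combined with openhook, and the FileInput object has
# no read() method, so pass encoding= directly (Python 3.10+) and iterate lines.
with fileinput.input(file_path, inplace=True, encoding="utf-8") as f:
    for line in f:
        # while inplace=True is active, print() writes to the file being replaced
        print(line.replace("<?xml version=\"1.0\" encoding=\"UTF-8\" ?>", ""), end='')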

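With inplace=True, the old file is moved to a <filename>.bak backup, and print output is redirected to a new file opened at the original location. By default the backup is deleted once the output file is closed.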
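If you would rather keep the backup, fileinput accepts a backup argument naming the extension to use; a backup created this way is not deleted. The sketch below assumes a made-up file name (page.html), an ".orig" extension, and ASCII-only content, so it omits the encoding argument and runs on older Python versions too.

```python
import fileinput
import os
import tempfile

# Build a small sample file in a scratch directory (illustrative content only)
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "page.html")
original = "<?xml ?>\n<html></html>\n"
with open(path, "w") as f:
    f.write(original)

# backup=".orig" keeps the moved-aside original next to the rewritten file
with fileinput.input(path, inplace=True, backup=".orig") as f:
    for line in f:
        print(line.replace("<?xml ?>", ""), end="")

with open(path) as f:
    rewritten = f.read()
with open(path + ".orig") as f:
    backup = f.read()
```

Because the replacement works line by line, removing the declaration leaves an empty first line here; strip or skip that line if it matters for your files.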
