Question

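I am finding patterns in a large text file, removing them, and printing the resulting text. However, I also want to replace the old text in the file (with the patterns) with new_text (without the patterns).

I am using the regular expression package (re), and nothing I try works.

My .replace call does not work.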

import re

rgx_list = ['Read More',
            'Read',
            'And follow us on Twitter to keep up with the latest news and and acute and primary Care.',...]

txt_path = '/Users/sofia/Documents/src/fakenews1/data/news-data/war_sc_r.txt'

with open(txt_path) as new_txt_file:
    new_text = new_txt_file.read()

for rgx_match in rgx_list:
    new_text = re.sub(rgx_match, '', new_text)

new_text.replace(txt_path, new_text)

print(new_text)

Thanks!

Answer 1

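I don't see why you are using replace() here. It may help to read the documentation for str.replace(): it substitutes one substring for another within a string and returns the result as a new string; it does not write anything to a file. If you want to write new_text to the file, open the file again in write mode and write the new content to it.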
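As a small illustration of the point about str.replace() (the sample string is made up): the method returns a new string and never opens or modifies a file, and because Python strings are immutable, the result is lost unless it is assigned to a name.

```python
text = "Read More about the war. Read the full story."

# str.replace(old, new) returns a new string; `text` itself is unchanged.
cleaned = text.replace("Read More", "")

# Calling replace() without assigning the result has no visible effect.
text.replace("Read", "")

print(repr(text))
print(repr(cleaned))
```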

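import re

# Phrases to strip from the text; re.sub() treats each entry as a regular expression.
rgx_list = ['Read More',
            'Read',
            'And follow us on Twitter to keep up with the latest news and and acute and primary Care.']

txt_path = 'newyorktimes_test.txt'

# Read the whole file into memory.
with open(txt_path) as new_txt_file:
    new_text = new_txt_file.read()

# Remove every match of each pattern.
for rgx_match in rgx_list:
    new_text = re.sub(rgx_match, '', new_text)

# Reopen the same file in write mode and overwrite it with the cleaned text.
with open(txt_path, 'w') as new_txt_file:
    new_txt_file.write(new_text)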
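
One caveat with the loop above: re.sub() interprets each list entry as a regular expression, so the trailing '.' in the third phrase matches any character, not just a period. If the entries are meant as literal phrases (an assumption about the question's intent), they can be escaped with re.escape() and combined into one pattern. A minimal sketch; the helper name remove_phrases is hypothetical:

```python
import re

def remove_phrases(text, phrases):
    """Remove every literal occurrence of each phrase from text."""
    # Longest first, so 'Read More' is tried before its prefix 'Read'.
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape() neutralises regex metacharacters such as '.'.
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    return pattern.sub("", text)

sample = "Read More here. Read on."
print(repr(remove_phrases(sample, ["Read", "Read More"])))
```

Removing phrases this way leaves the surrounding whitespace untouched, so doubled spaces may remain where a phrase used to be.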

If you are looking for a way to edit the file more directly in place, there are libraries that can help; a Google search will turn them up.
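
One standard-library option for in-place editing is the fileinput module (this is a suggestion, not necessarily the library the answer had in mind). With inplace=True, anything printed inside the loop is written to the file in place of the original line. The sketch below works line by line, so it only suits phrases that do not span lines; the function name and its arguments are hypothetical.

```python
import fileinput
import re

def strip_phrases_in_place(path, phrases):
    """Rewrite the file at path with each literal phrase removed, line by line."""
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            # While inplace=True is active, print() writes to the file, not the console.
            print(pattern.sub("", line), end="")

# Example call (the path is a placeholder):
# strip_phrases_in_place('newyorktimes_test.txt', ['Read More', 'Read'])
```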
