Question

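I'm new to Python, and I need to clean up the headers in my files in a particular way. The headers currently follow no standard, so I'm trying to write a script that I can reuse across multiple files.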

Example file:

*_____________________________
* This is header text
* For details, see foobar.txt.
*_____________________________
*
*

* Code goes here
Code = x

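The way I have to do this is to define where the header starts and ends, then scrub everything in between (including the start and end markers) before adding the new header.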

Currently I'm trying to use

start_pos = r"*_____________________________"
end_pos = r"""*_____________________________
    *
    *"""

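and then search for everything in between. I then want to treat the whole span as a single block and delete/replace it so that the new file looks like this: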

*
* Hello, world.
*

* Code goes here
Code = x

Answer 1

Here is an expression that matches the whole header:

\*_____________________________([\s\S]*?)\*_____________________________(?:\n\*){2}

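To match everything in between, we can use a modified "dot", [\s\S], which matches any character, including newlines. It is quantified lazily (*?) so that the match stops at the first closing marker instead of over-matching.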
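As a small illustration of why the lazy quantifier matters, the sketch below contrasts [\s\S]*? with its greedy form and with a plain dot. The `<<` and `>>` delimiters and the sample text are made up for the demonstration and are not part of the original file format.

```python
import re

# Hypothetical text with two delimited blocks; the first one spans a newline.
text = "<<a\nb>> middle <<c>>"

# Lazy: each match stops at the nearest closing delimiter.
lazy = re.findall(r"<<([\s\S]*?)>>", text)

# Greedy: a single match runs on to the last closing delimiter.
greedy = re.findall(r"<<([\s\S]*)>>", text)

# A plain dot does not match "\n", so the first block is missed...
plain_dot = re.findall(r"<<(.*?)>>", text)

# ...unless re.DOTALL is set, which makes "." behave like [\s\S].
dotall = re.findall(r"<<(.*?)>>", text, flags=re.DOTALL)

print(lazy)       # ['a\nb', 'c']
print(greedy)     # ['a\nb>> middle <<c']
print(plain_dot)  # ['c']
print(dotall)     # ['a\nb', 'c']
```

If you prefer the ordinary dot, passing flags=re.DOTALL gives the same result as [\s\S] here.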

Sample code:

import re
regex = r"\*_____________________________([\s\S]*?)\*_____________________________(?:\n\*){2}"
test_str = ("*_____________________________\n"
    "* This is header text\n"
    "* For details, see foobar.txt.\n"
    "*_____________________________\n"
    "*\n"
    "*\n\n"
    "* Code goes here\n"
    "Code = x\n")
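# re.sub expands the escaped "\\n" in the replacement template into a real newline
subst = "*\\n* Hello, world.\\n*"

# count=0 replaces every match; pass count=1 to replace only the first header
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

print(result)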
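Since the question asks for something reusable with explicit start and end markers, one possible way to wrap the same idea is sketched below. The function name `replace_header` and its parameters are hypothetical, and the sketch assumes the markers appear in the file exactly as given, with no extra indentation or trailing spaces. re.escape turns the literal markers into a safe pattern, so the `*` characters need no manual escaping.

```python
import re

def replace_header(text, start_marker, end_marker, new_header):
    """Replace the first span from start_marker through end_marker (inclusive)."""
    # re.escape makes the literal markers safe to embed in a pattern;
    # [\s\S]*? lazily matches everything in between, newlines included.
    pattern = re.escape(start_marker) + r"[\s\S]*?" + re.escape(end_marker)
    # A function replacement inserts new_header verbatim, so backslashes
    # in the new header are not interpreted by re.sub.
    return re.sub(pattern, lambda m: new_header, text, count=1)

rule = "*_____________________________"
start_marker = rule
end_marker = rule + "\n*\n*"  # closing rule plus the two bare "*" lines

sample = (
    "*_____________________________\n"
    "* This is header text\n"
    "* For details, see foobar.txt.\n"
    "*_____________________________\n"
    "*\n"
    "*\n\n"
    "* Code goes here\n"
    "Code = x\n"
)

cleaned = replace_header(sample, start_marker, end_marker, "*\n* Hello, world.\n*")
print(cleaned)
```

To apply this to real files, read each file's contents into a string, pass it through the function, and write the result back. count=1 limits the change to the first header, which guards against a similar-looking block further down the file.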

Search/replace a header using start/end markers in Python
