Question

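I want my program to add a new line every time it finds a match for a regular expression. I want to keep the matched text and only insert a line break after it. The text is read from a .txt file. I can find the match, but when I try to add the new line I get the result shown under "Actual output" below. I have been trying to fix this for hours and would appreciate any help.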

Here is a simple example:

In:

STLB 1234 444 text text text
STLB 8796 567 text text text

In (edited):

STLB 1234 444text text text

STLB 8796 567text text text

Wanted output:

STLB 1234 444
text text text
STLB 8796 567
text text text

Actual output:

(STLB.*\d\d\d) 

(STLB.*\d\d\d)

Here is my code:

stlb_match = re.compile('|'.join(['STLB.*\d\d\d']))

with open(in_file5, 'r', encoding='utf-8') as fin5, open(out_file5, 'w', encoding='utf-8') as fout5:
    lines = fin5.read().splitlines()

    for i, line in enumerate(lines):
        matchObj1 = re.match(start_rx, line)

        if not matchObj1:
            first_two_word = (" ".join(line.split()[:2]))

            if re.match(stlb_match,line):
                line =re.sub(r'(STLB.*\d\d\d)', r'(STLB.*\d\d\d)'+' \n', line)
            elif re.match(first_two_word, line):
                line = line.replace(first_two_word, "\n" + first_two_word)

        fout5.write(line)

Answer 1

Assuming the lines always have the format STLB <number> <number> <text>, you can do the following:

Code

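import re

with open(in_file5, 'r', encoding='utf-8') as fin5, open(out_file5, 'w', encoding='utf-8') as fout5:
    for line in fin5:
        # Keep the "STLB <number> <number>" header and break the line after it.
        line = re.sub(r'(STLB\s*\d+\s*\d+)\s*', r'\1\n', line)

        fout5.write(line)
        fout5.write('\n')  # adds the blank line between records seen in the output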

Input

STLB 1234 444 text text text
STLB 8796 567 text text text

Output

STLB 1234 444
text text text

STLB 8796 567
text text text
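
The loop can be tried without any files. This is a minimal sketch, assuming in-memory `io.StringIO` buffers as stand-ins for `in_file5` and `out_file5`; the helper name `split_after_header` is made up for illustration.

```python
import io
import re

# Pattern from the answer: the group keeps "STLB <number> <number>",
# the trailing \s* matches the whitespace that follows it.
STLB_RX = re.compile(r'(STLB\s*\d+\s*\d+)\s*')

def split_after_header(fin, fout):
    """Copy fin to fout, breaking each line after the STLB header."""
    for line in fin:
        fout.write(STLB_RX.sub(r'\1\n', line))
        fout.write('\n')  # extra newline, as in the answer's loop

# Hypothetical in-memory stand-ins for the input and output files.
source = io.StringIO(
    "STLB 1234 444 text text text\n"
    "STLB 8796 567 text text text\n"
)
result = io.StringIO()
split_after_header(source, result)
print(result.getvalue(), end='')
```

Each line read from the file already ends in a newline, so the second `write` is what produces the blank line after every record. Dropping it should give the output without blank lines that the question asks for.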

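Note the \s* at the end of the regex: the capture group ends before it, so the whitespace that follows the header is matched but is not part of \1, and it is dropped from the output.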
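
To see what the trailing `\s*` changes, here is a small comparison on a single string. The sample line with three spaces after the header is an assumption chosen to make the difference visible.

```python
import re

line = "STLB 1234 444   text text text"

# With \s* after the group, the spaces following the header are part of
# the match, so the replacement overwrites them.
with_trailing = re.sub(r'(STLB\s*\d+\s*\d+)\s*', r'\1\n', line)

# Without it, the match stops at the last digit and the spaces survive
# at the start of the second line.
without_trailing = re.sub(r'(STLB\s*\d+\s*\d+)', r'\1\n', line)

print(repr(with_trailing))     # 'STLB 1234 444\ntext text text'
print(repr(without_trailing))  # 'STLB 1234 444\n   text text text'
```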

Using a list comprehension and `writelines`

with open(in_file5, 'r', encoding='utf-8') as fin5, open(out_file5, 'w', encoding='utf-8') as fout5:
    fout5.writelines([re.sub(r'(STLB\s*\d+\s*\d+)\s*', r'\1\n', l) for l in fin5])
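
A runnable variant of the `writelines` version, as a sketch only: the temporary directory and the file names `in.txt` and `out.txt` are placeholders for the real `in_file5` and `out_file5`, and a generator expression is used in place of the list.

```python
import re
import tempfile
from pathlib import Path

STLB_RX = re.compile(r'(STLB\s*\d+\s*\d+)\s*')

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names standing in for in_file5 / out_file5.
    in_path = Path(tmp) / 'in.txt'
    out_path = Path(tmp) / 'out.txt'
    in_path.write_text(
        'STLB 1234 444 text text text\n'
        'STLB 8796 567 text text text\n',
        encoding='utf-8',
    )

    with open(in_path, encoding='utf-8') as fin, \
         open(out_path, 'w', encoding='utf-8') as fout:
        # Each input line keeps its own trailing newline, so nothing
        # extra is written and no blank lines appear between records.
        fout.writelines(STLB_RX.sub(r'\1\n', line) for line in fin)

    written = out_path.read_text(encoding='utf-8')

print(written, end='')
```

Because `writelines` does not add line separators of its own, this version produces the four lines from the question's wanted output with no blank lines in between.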

Let me know if this works for you.

Answer 2

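Your replacement argument is wrong: it is not a regex pattern, so you cannot put one there. Use a backreference to the captured group instead: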

import re

line = 'STLB 1234 444 text text text'
# \1 inserts the text captured by the first group, followed by a newline.
line = re.sub(r'(STLB.*\d\d\d)', r"\1\n", line)
print(line)

Output:

STLB 1234 444
 text text text

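Or, if you also want to remove the space at the start of the second line, include that space in the pattern after the group:

line = re.sub(r'(STLB.*\d\d\d) ', r"\1\n", line)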
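
The point that the replacement is a template rather than a pattern can be checked directly. This is a sketch, assuming Python 3.7 or later, where an unknown escape such as `\d` in the replacement raises `re.error`; older versions kept the text literally, which is consistent with the actual output in the question. The variable names are illustrative only.

```python
import re

line = 'STLB 1234 444 text text text'
pattern = r'(STLB.*\d\d\d) '

# A pattern used as the replacement is not matched against the input again.
# On Python 3.7+ the unknown escape \d in a replacement raises re.error.
try:
    re.sub(pattern, r'(STLB.*\d\d\d)' + ' \n', line)
    raised = False
except re.error:
    raised = True

# Three equivalent ways to refer to the first capture group.
numbered = re.sub(pattern, r'\1\n', line)
explicit = re.sub(pattern, r'\g<1>\n', line)
via_func = re.sub(pattern, lambda m: m.group(1) + '\n', line)

print(raised)          # True on Python 3.7+
print(repr(numbered))  # 'STLB 1234 444\ntext text text'
```

The `\g<1>` form is useful when the group reference is followed by a digit, and the function form avoids template escapes altogether.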
