Question

I have a file with the following structure:

..............................
Delimiter [1]
..............................
blablabla
..............................
Delimiter CEO [2]
..............................
blabla
..............................
Delimiter [3]
..............................

[...]

..............................
Delimiter CEO [n-1]
..............................
blablabla
..............................
Delimiter [n]
..............................

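I wrote code that extracts all the delimiters, but it also extracts some lines I don't need, and those lines keep my code from running correctly. I want to save a line to a new .txt file only if that line contains a match for the regular expression "[a number]". To make the extraction more precise, I wrote this Python code using re (following this answer):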

import re
with open('testoestratto.txt','r',encoding='UTF-8') as myFile:
    text = myFile.readlines()
    text = [frase.rstrip('\n') for frase in text]
    regex = r'\[\d+\]'
    new_file=[]
    for lines in text:
       match = re.search(regex, lines, re.MULTILINE)
       if match:            
           new_line = match.group() + '\n'            
           new_file.append(new_line)

with open('prova.txt', 'w') as f:     
     f.seek(0)    
     f.writelines(new_file)

However, the 'prova.txt' file contains only the regex matches themselves, so I end up with a file containing [1], [2], ... [n-1], [n] instead of the full lines.

Answer 1

Your new_file is a list of the matches found in the file, not of the lines: you fill it with match.group() plus a newline.
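To see the difference concretely, here is a minimal sketch on a made-up sample line (the string below is an assumption for illustration, not taken from your actual file). match.group() returns only the text that matched the pattern, while the line variable still holds the whole line:

```python
import re

line = "Delimiter CEO [2]"            # hypothetical sample line
match = re.search(r'\[\d+\]', line)

matched_text = match.group()          # only the part that matched the pattern
print(matched_text)                   # [2]
print(line)                           # Delimiter CEO [2]
```

So appending match.group() stores "[2]", whereas appending the line itself would store the full delimiter line.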

Instead, check whether a line contains a match for \[\d+] and, if it does, write the whole line to the new file:

import re

reg = re.compile(r'\[\d+]') # Matches a [ char, followed with 1+ digits and then ]

with open('prova.txt', 'w', encoding='UTF-8') as f:     # open file for writing, same encoding as the input
    with open('testoestratto.txt','r',encoding='UTF-8') as myFile: # open file for reading
        for line in myFile:           # read myFile line by line
            if reg.search(line):      # if there is a match anywhere in a line
                f.write(line)         # write the line into the new file
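If you want to try the filtering logic without touching any files, the same idea can be sketched as a small function over a list of strings. The function name keep_numbered_lines and the sample lines are hypothetical, chosen only for this illustration:

```python
import re

BRACKETED_NUMBER = re.compile(r'\[\d+\]')   # "[" + one or more digits + "]"

def keep_numbered_lines(lines):
    """Return only the lines that contain a bracketed number such as [12]."""
    return [line for line in lines if BRACKETED_NUMBER.search(line)]

# Hypothetical sample modelled on the structure shown in the question.
sample = [
    "..............................\n",
    "Delimiter [1]\n",
    "blablabla\n",
    "Delimiter CEO [2]\n",
    "Delimiter CEO [n-1]\n",   # literal "n-1" is not a number, so no match
]

kept = keep_numbered_lines(sample)
print("".join(kept), end="")
```

This keeps "Delimiter [1]" and "Delimiter CEO [2]" and drops the rest. Note that re.MULTILINE is not needed here: that flag only changes how ^ and $ behave, and the pattern uses neither.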
