How can I use a regular expression to copy a section from a file?

Time: 2018-08-02 00:09:15

Tags: python regex

I have a file with the following structure:

******
Block 1
text
text
...
End 
******
Block 2
text
text
...
End 
******
Block 3
text
text
...
End 
******

and so on. I want to open the file, read it line by line, and save the contents of the first block in a string. This is what I have so far:

Block = ''
with open(File) as file:
    for line in file:
        if re.match('\.Block.*', line):
            Block += line
        if 'str' in line:
            break
print(Block)

However, when I print Block, I get:

Block 1
Block 2
...

How can I use a regular expression to copy the lines from Block 1 through End? Thanks.

3 Answers:

Answer 0: (score: 1)

You can use itertools.groupby to split the lines into runs separated by the rows of asterisks:

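import itertools, re

# Read the lines without their trailing newlines; the with block closes the file.
with open('filename.txt') as f:
    lines = [i.rstrip('\n') for i in f]
# Group consecutive lines by "is this a row of asterisks?" and keep the other groups.
first_result, *_ = [list(b) for a, b in itertools.groupby(lines, key=lambda x: bool(re.match(r'^\*+$', x))) if not a]
print(first_result)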

Output:

['Block 1', 'text', 'text', '...', 'End ']
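
To make the grouping step easier to follow, here is a small sketch that prints nothing from disk and instead runs on a hypothetical in-memory list (`sample_lines` and `is_separator` are names invented for this illustration, not part of the answer). It shows that groupby yields alternating runs of separator and non-separator lines, which is why filtering on the key leaves one list per block.

```python
import itertools
import re

# Hypothetical sample standing in for the file's lines (newlines already removed).
sample_lines = ["******", "Block 1", "text", "End ",
                "******", "Block 2", "text", "End ", "******"]

def is_separator(line):
    """Return True for a line that consists only of asterisks."""
    return bool(re.match(r"^\*+$", line))

# groupby yields one (key, group) pair per run of consecutive lines with the same key.
runs = [(key, list(group))
        for key, group in itertools.groupby(sample_lines, key=is_separator)]

# Keep only the runs that are not separator rows: one list per block.
blocks = [group for key, group in runs if not key]
print(blocks[0])
```

Because groupby only merges consecutive items, this relies on every block being bounded by a separator row, as in the file shown in the question.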

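Answer 1: (score: 0)

Your code only appends lines that themselves match the regex '\.Block.*', which is why you only collect the header lines. To capture the contents of a block you have to do a bit more work: keep track of whether you are currently inside a block.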

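import re

Block = ''
Match = False  # True while we are inside the first block
with open(File) as file:  # File is the path to the input file
    for line in file:
        if re.match(r'Block', line):
            Match = True  # the block header starts the capture
        if Match:
            Block += line
            if re.match(r'End\s*$', line):  # tolerate the trailing space after End
                break  # stop once the first block's End line is copied
print(Block)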
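
The flag-based idea can also be wrapped in a function and tried without a file on disk. The following is only a sketch under stated assumptions: `first_block` is a hypothetical helper name, the sample text is made up to mirror the question's layout, and io.StringIO stands in for the opened file.

```python
import io
import re

def first_block(lines):
    """Return the text from the first 'Block' header line through its 'End' line."""
    collected = []
    inside = False
    for line in lines:
        # Start capturing at the first line that begins with "Block".
        if not inside and re.match(r"Block\b", line):
            inside = True
        if inside:
            collected.append(line)
            # "End" may be followed by trailing whitespace, as in the question.
            if re.match(r"End\s*$", line):
                break
    return "".join(collected)

# io.StringIO iterates line by line, like an open text file.
sample = io.StringIO(
    "******\nBlock 1\ntext\ntext\nEnd \n******\nBlock 2\ntext\nEnd \n******\n"
)
result = first_block(sample)
print(result)
```

Checking for End only while inside a block keeps a stray End line before the first header from ending the loop early.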

Answer 2: (score: 0)

import re

with open(File) as ff:
    txt = ff.read()  # read the whole file into one string

re.findall(r"(?ms)^\s*Block\s*\d+.*?^\s*End\s*$", txt)

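 Out: 
        ['Block 1\ntext\ntext\n...\nEnd ',
         'Block 2\ntext\ntext\n...\nEnd ',
         'Block 3\ntext\ntext\n...\nEnd ']

Alternatively, change '\d+' to '1' to get only the first block. Note that a bare '1' would also match 'Block 10'; '1\b' avoids that.
(?ms) sets two flags. m is multiline mode, so ^ and $ match at the start and end of every line. s makes '.' match newlines too.
The ? in '.*?' makes that part of the pattern non-greedy, so each match stops at the nearest End line.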
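
If only one block is needed, re.search returns the first match without collecting the rest. The sketch below uses a made-up `sample_text` in which Block 10 deliberately comes before Block 1, and a hypothetical helper `block_by_number`, to show the difference between taking the first match and pinning the block number.

```python
import re

# Hypothetical sample; Block 10 is placed first on purpose.
sample_text = "******\nBlock 10\nten\nEnd \n******\nBlock 1\none\nEnd \n******\n"

# The answer's pattern; search() stops at the first block it finds.
any_block = re.compile(r"(?ms)^\s*Block\s*\d+.*?^\s*End\s*$")
match = any_block.search(sample_text)
first_found = match.group(0) if match else None

# Replacing \d+ with a bare 1 still accepts "Block 10", because "10" starts with 1.
naive = re.search(r"(?ms)^\s*Block\s*1.*?^\s*End\s*$", sample_text).group(0)

def block_by_number(text, number):
    """Return the block with exactly this number, or None if it is absent."""
    # \b after the number rejects longer numbers such as 10 when asking for 1.
    m = re.search(rf"(?ms)^\s*Block\s*{number}\b.*?^\s*End\s*$", text)
    return m.group(0) if m else None

print(repr(first_found))
print(repr(naive))
print(repr(block_by_number(sample_text, 1)))
```

Reading the whole file into memory is fine for small inputs; for very large files a line-by-line loop avoids holding everything at once.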