Python - Repeated blocks of text

Date: 2017-12-18 15:34:25

Tags: python perl

I'm new to Python, but I've been using Perl for a while. In Perl, to restrict a search of a file to a specific block of text, I would write something like this:

if (/start_line/ ... /end_line/) {
   # do something here
}

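The condition /start_line/ ... /end_line/ becomes true once the /start_line/ regex matches, and stays true until after the /end_line/ regex matches. In a loop that reads the input line by line, this executes the if-block for every line from the start line through the end line.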

How can I express the same condition in Python?

4 answers:

Answer 0 (score: 2)

What if you try something like this?

start_line = "line 1"
end_line = "line 2"
in_block = False
line_block = []

with open("file.txt") as search:
    for line in search:
        line = line.rstrip("\n")  # remove '\n' at end of line
        if not in_block and line == start_line:
            in_block = True  # the start line opens the block
        if in_block:
            line_block.append(line)
            if line == end_line:
                in_block = False  # the end line is kept, then the block closes
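
If the file contains several such blocks, the flat list above mixes them together. A minimal sketch that keeps each block separate follows; the function name `collect_blocks` and the sample data are made up for illustration, and it assumes the same exact-match start and end lines as above.

```python
def collect_blocks(lines, start_line, end_line):
    """Return a list of blocks; each block is the list of lines
    from start_line through end_line, inclusive."""
    blocks = []
    current = None  # None means we are outside a block
    for raw in lines:
        line = raw.rstrip("\n")
        if current is None and line == start_line:
            current = []  # open a new block
        if current is not None:
            current.append(line)
            if line == end_line:
                blocks.append(current)  # close and store the block
                current = None
    # A block that is still open at end of input is discarded here.
    return blocks


# Hypothetical sample input; an open file object works the same way.
text = ["x", "line 1", "a", "line 2", "y", "line 1", "b", "line 2"]
blocks = collect_blocks(text, "line 1", "line 2")
print(blocks)
```

Whether an unterminated block should be dropped or kept is a design choice; this sketch drops it.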

Answer 1 (score: 2)

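You're referring to Perl's flip-flop operator (..). Basically, it sets a boolean flag to true when the first condition is met and back to false after the second condition is met, so both the start and end lines are included. Looked at that way, it is fairly simple to implement.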

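import re

flip = False
with open(filename) as f:  # filename: path to the input file
    for line in f:
        # re.search matches anywhere in the line, like Perl's /start-text/
        if not flip and re.search('start-text', line):
            flip = True
        if flip:
            print(line, end='')  # line already ends with '\n'
            if re.search('end-text', line):
                flip = False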
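
Perl has two forms of this operator: with `..` the end condition is also tested on the line that opened the block, while with `...` (the form used in the question) it is first tested on the following line. The sketch below shows how the flag logic could cover both; the name `flip_flop` and the `three_dot` parameter are illustrative assumptions, not an existing API.

```python
import re


def flip_flop(lines, start, end, three_dot=False):
    """Yield the lines of every start-to-end block, both ends included.

    three_dot=False mimics Perl's `..` (end is tested on the opening line too);
    three_dot=True mimics `...` (end is first tested on the next line).
    """
    start_re, end_re = re.compile(start), re.compile(end)
    inside = False
    for line in lines:
        opened = False
        if not inside and start_re.search(line):
            inside = True
            opened = True  # this line just opened the block
        if inside:
            yield line
            if end_re.search(line) and not (three_dot and opened):
                inside = False


# Hypothetical sample where one line matches both patterns.
sample = ["a", "BEGIN END", "b", "END", "c"]
two = list(flip_flop(sample, "BEGIN", "END"))
three = list(flip_flop(sample, "BEGIN", "END", three_dot=True))
print(two)
print(three)
```

The two forms only differ when a single line matches both the start and the end pattern.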

Answer 2 (score: 0)

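I originally wrote a generator that yielded lines from the first matching block only (see the edit history), but an edit prompted me to reconsider my first proposal, because the expected behavior is to execute the if-body on every matching block of text.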

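My new proposal is again a generator. By default it keeps yielding lines from matching blocks of text "forever", but with an optional keyword argument it can also process matching blocks at most a given number (count) of times.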

A proof of concept follows.

def from_beg_to_end(filename, beg, end, count=0):
    '''Yields the lines from `filename` like in `sed -n /beg/,/end/p`

    By default (for `count=0`) if the file contains multiple blocks
    all the blocks are output, for `count` greater than zero the number
    of blocks whose lines are returned is _at most_ `count`.

    Example of use:

    for line in from_beg_to_end(filename, 'a', 'b'):
        ...
    '''
    inside = False
    with open(filename) as f:
        for line in f:
            if not inside:
                if beg in line: inside = True
            if inside:
                yield line
                if end in line:
                    # with count=0 this goes negative and never hits 0,
                    # so all blocks are processed
                    count = count-1
                    if count==0: return
                    inside = False

This uses simple substring matching. It should be easy to adapt the code above to support regular expressions.
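
One possible adaptation to regular expressions is sketched below. It is not the answerer's code: the name `from_beg_to_end_re` is hypothetical, and it assumes the function takes any iterable of lines (an open file or a list) instead of a filename, so that it can be tried without a file on disk.

```python
import re


def from_beg_to_end_re(lines, beg, end, count=0):
    """Yield the lines of each block from a `beg` match to an `end` match.

    `beg` and `end` are regular expressions tested with re.search.
    For count > 0, at most `count` blocks are yielded; count=0 means all.
    """
    beg_re, end_re = re.compile(beg), re.compile(end)
    inside = False
    for line in lines:
        if not inside and beg_re.search(line):
            inside = True
        if inside:
            yield line
            if end_re.search(line):
                count -= 1
                if count == 0:
                    return  # requested number of blocks reached
                inside = False


# Hypothetical sample data standing in for the lines of a file.
demo = ["skip\n", "start 1\n", "body\n", "stop 1\n",
        "skip\n", "start 22\n", "stop 22\n"]
first_only = list(from_beg_to_end_re(demo, r"^start \d+", r"^stop \d+", count=1))
all_blocks = list(from_beg_to_end_re(demo, r"^start \d+", r"^stop \d+"))
print(first_only)
print(all_blocks)
```

To read from a file, pass the open file object, for example `from_beg_to_end_re(open_file, beg, end)` inside a `with open(...)` block.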

Answer 3 (score: 0)

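The Perl solution relies on the flip-flop operator, which keeps state between successive loop iterations. The other solutions achieve this by updating a flag variable. You can also write a class so that the state of the previous match is kept in an instance variable. An example is given below. This is the closest I could get to the elegant Perl syntax.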

#Fileblock.py
import re

class Block_Extract:
    def __init__(self):
        self.state = False  # True while we are inside a block

    def test(self, line, start, end):
        if not self.state:
            # outside a block: only a start match can open one
            if not re.search(start, line):
                return False
            self.state = True
        if re.search(end, line):
            # the end line still belongs to the block, then the block closes
            self.state = False
        return True


start = "line3"
end = "line7"
fileblock = Block_Extract()
with open("Block_Test") as fp:
    for line in fp:
        line = line.rstrip()
        if fileblock.test(line, start, end):
            print(line)

$ cat Block_Test 
This is line1
This is line2
This is line3
This is line4
This is line5
This is line6
This is line7
This is line8
This is line9
This is line10
$ python Fileblock.py 
This is line3
This is line4
This is line5
This is line6
This is line7