Skip lines until the next block in a text file once a condition is met, in Python

Date: 2018-01-17 20:04:35

Tags: python text io text-files

I have a huge text file containing "blocks" like this:

block
object pen
fruit apple
people mike
block
electronic laptop
city dallas
fruit banana
object stapler
vehicle car
block
people george
fruit orange
vehicle truck
city austin
object hammer

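Each block contains exactly one fruit, on a random line, and each block has a different number of lines. I want to iterate over this file, print everything up to and including the fruit name, and then jump to the next block. Once I have found the fruit in a block, checking whether the following lines are fruits is a waste of time. I just want to skip ahead to the next block, but the problem is that I don't know how many lines remain before the next block starts. So the output should look like this: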

block
object pen
the fruit is: apple
block
electronic laptop
city dallas
the fruit is: banana
block
people george
the fruit is: orange

I can produce this output in two ways. One:

flag = True
with open("sample.txt", "r") as f:
    for line in f.readlines():
        if line.split()[0] == 'fruit':
            print "the fruit is: " + line.split()[1]
            flag = False
        if line.split()[0] == 'block':
            flag = True
        if flag:
            print line

And two:

flag = False
with open("sample.txt", "r") as f:
    for line in f.readlines():
        if line.split()[0] == 'fruit':
            print "the fruit is: " + line.split()[1]
            flag = True
        if line.split()[0] == 'block':
            flag = False
        if flag:
            continue
        print line 

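But this is not what I want. My code still checks every line for a fruit. I want to skip the lines after the fruit until the next block and continue from there. How can I do that?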

3 answers:

Answer 0: (score: 2)

from itertools import dropwhile

def not_block(line):
    return line.rstrip() != 'block'

with open("sample.txt", "r") as f:
    while True:
        # dropwhile discards lines until the next 'block' line, then yields the rest
        for line in dropwhile(not_block, f):
            if line.startswith('fruit '):
                print("the fruit is: " + line.split()[1])
                break  # done with this block; the next pass skips ahead to 'block'
            print(line.rstrip())
        else:
            break  # end of file reached without finding another fruit

Answer 1: (score: 1)

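You can add an inner loop that is triggered once a block line is found. Note that this assumes your data is well-formed (i.e. every block has a fruit).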

with open('data.txt') as f:
    for line in f:
        line = line.strip()
        if line == 'block':
            print(line)
            for line in f:
                line = line.strip()
                if line.startswith('fruit '):
                    print('the fruit is:', line.split(None, 1)[1])
                    break
                else:
                    print(line)

If the data file is really large, another thing you can do, which is more involved but much faster, is to use mmap together with find().
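As a rough, unbenchmarked sketch of that mmap idea: the version below lets `find()` do the scanning instead of looping over lines in Python. It assumes Python 3, a non-empty file with LF line endings, block lines that are exactly `block`, and a fruit line in every block. The names `next_line_with_prefix` and `blocks_up_to_fruit` are made up for illustration.

```python
import mmap

def next_line_with_prefix(mm, prefix, pos):
    """Offset of the next line at or after pos that starts with prefix, or -1.

    pos is assumed to be the offset of a line start.
    """
    if mm[pos:pos + len(prefix)] == prefix:
        return pos
    idx = mm.find(b"\n" + prefix, pos)
    return -1 if idx == -1 else idx + 1

def blocks_up_to_fruit(path):
    """Yield each block's lines up to its fruit line, then jump to the next block."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        pos = 0
        while True:
            start = next_line_with_prefix(mm, b"block\n", pos)
            if start == -1:
                break  # no more blocks
            fruit = next_line_with_prefix(mm, b"fruit ", start)
            if fruit == -1:
                break  # assumption violated: no fruit line left in the file
            end = mm.find(b"\n", fruit)
            if end == -1:
                end = len(mm)  # fruit line is the last line, without a newline
            # Everything from 'block' up to (not including) the fruit line
            for line in mm[start:fruit].decode().splitlines():
                yield line
            yield "the fruit is: " + mm[fruit + len(b"fruit "):end].decode().strip()
            pos = min(end + 1, len(mm))  # start of the line after the fruit
```

You would use it as `for line in blocks_up_to_fruit("data.txt"): print(line)`. The lines after each fruit are never split or inspected in Python; the search for the next `block` happens inside `find()`.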

Answer 2: (score: 0)

Call me crazy, but I'd like to take a different stab at this using a list comprehension and zip():

with open("sample.txt", "r") as file:
    lines = [line.strip('\n') for line in file.readlines()]

blocks = [i for i, j in enumerate(lines) if j == 'block']
fruits = [i for i, j in enumerate(lines) if 'fruit' in j]

for i, j in zip(blocks, fruits):
    print('\n'.join(lines[i:j+1]))

Output:

block
object pen
fruit apple
block
electronic laptop
city dallas
fruit banana
block
people george
fruit orange

This only works, however, if every block is always followed by a fruit before the next block.

It does look pretty, though. Don't question my weapon of choice...
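If a block might be missing its fruit line, pairing the two index lists with zip() gets out of step. One possible way around that, sketched here under the assumption that a fruit-less block should simply be kept whole, is to look up each block's fruit with bisect. The function name `slices_up_to_fruit` is hypothetical.

```python
from bisect import bisect_right

def slices_up_to_fruit(lines):
    """Return one list of lines per block, cut off after the block's fruit line."""
    blocks = [i for i, s in enumerate(lines) if s == 'block']
    fruits = [i for i, s in enumerate(lines) if s.startswith('fruit ')]
    result = []
    for n, start in enumerate(blocks):
        # The block ends where the next block starts (or at the end of the file)
        stop = blocks[n + 1] if n + 1 < len(blocks) else len(lines)
        # Index of the first fruit line located after this block's header
        k = bisect_right(fruits, start)
        if k < len(fruits) and fruits[k] < stop:
            result.append(lines[start:fruits[k] + 1])
        else:
            result.append(lines[start:stop])  # assumption: no fruit, keep whole block
    return result
```

With `lines` built as above, `for chunk in slices_up_to_fruit(lines): print('\n'.join(chunk))` prints the same thing as the zip() version whenever every block does have a fruit.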