Extracting information between strings in a text file

Asked: 2015-04-12 19:46:22

Tags: python string text extract

I have a data file structured as follows:

handle:trial1

key_left:3172

key_up:

xcoords:12,12,12,15........

ycoords:200,200,206,210,210......

t:20,140,270,390.....

goalx:2

goaly:12

fractal:images/file.png

seen:true

pauseTimes:

fractal:images/file2.png

seen:False

pauseTimes:

...
...

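I want to extract only the information after the goaly line, up to the pauseTimes line. If I knew the goaly value for every trial, I could specify that line and extract the data between goaly: and pauseTimes, but I won't know any goaly value in advance because they are generated dynamically.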

How can I use the string goaly to identify the line, and then extract all subsequent lines up to the pauseTimes line?

2 Answers:

Answer 0 (score: 0)

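extracted = []  # lines collected between the two markers
extracting = False
with open('path/to/file') as f:
    for line in f:
        # Start collecting at the goaly line (this line is included).
        if line.startswith('goaly:'):
            extracting = True
        if extracting:
            # I'm not really sure how you want to receive this
            # data, so this simply collects the lines in a list.
            extracted.append(line.rstrip('\n'))
        # Stop after the pauseTimes line (this line is included too).
        if line.startswith('pauseTimes:'):
            extracting = False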
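
To try the flag approach without a real data file, here is a minimal sketch that wraps the same loop in a function and runs it over an in-memory sample. The function name `extract_between`, its `start` and `stop` parameters, and the sample text are assumptions made for illustration, not part of the question. Like the loop above, it keeps both the goaly line and the pauseTimes line.

```python
import io


def extract_between(lines, start='goaly:', stop='pauseTimes:'):
    """Collect lines from a `start` line through the next `stop` line (both kept)."""
    extracted = []
    extracting = False
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith(start):
            extracting = True
        if extracting:
            extracted.append(line)
        if line.startswith(stop):
            extracting = False
    return extracted


# Hypothetical sample that mirrors the layout shown in the question.
sample = io.StringIO(
    "handle:trial1\n"
    "goalx:2\n"
    "goaly:12\n"
    "fractal:images/file.png\n"
    "seen:true\n"
    "pauseTimes:\n"
    "fractal:images/file2.png\n"
    "seen:False\n"
    "pauseTimes:\n"
)

result = extract_between(sample)
print(result)
```

Note that the second fractal block is skipped, because extraction only restarts at the next goaly line. If the marker lines themselves are not wanted, one option would be to move the `extracting` check after the `stop` test and skip the `start` line.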

Answer 1 (score: 0)

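You can loop with a state variable that tracks whether you care about the current line. I like to use a generator to track parsing state like this, so that it stays separate from the processing code. For your example, here is the generator: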

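def parse(infile):
    """Yield (trial, line) for each line from goaly: up to, but not including, pauseTimes:."""
    returning = False  # True while we are inside a goaly..pauseTimes span
    trial = None       # name taken from the most recent handle: line
    for line in infile:
        line = line.rstrip()
        if not line:
            continue  # skip blank lines

        if line.startswith('handle:'):
            trial = line[len('handle:'):]

        if line.startswith('goaly:'):
            returning = True
        elif line.startswith('pauseTimes:'):
            returning = False

        if returning:
            yield trial, line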

Here is how you would use it:

# Open the file in a with block so it is closed once parsing finishes.
with open('test.txt', 'r') as infile:
    for trial, line in parse(infile):
        print(trial, line)

This has the bonus feature of tracking which trial you are in.
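
Because the generator yields the trial name with every line, the results can be grouped into one record per trial. The following is a minimal sketch under a few assumptions: the helper `group_by_trial` and the two-trial sample data are hypothetical, and `parse` is repeated from the answer only so the block runs on its own.

```python
import io
from collections import defaultdict


def parse(infile):
    """Generator from the answer: yield (trial, line) from goaly: up to pauseTimes:."""
    returning = False
    trial = None
    for line in infile:
        line = line.rstrip()
        if not line:
            continue
        if line.startswith('handle:'):
            trial = line[len('handle:'):]
        if line.startswith('goaly:'):
            returning = True
        elif line.startswith('pauseTimes:'):
            returning = False
        if returning:
            yield trial, line


def group_by_trial(pairs):
    """Hypothetical helper: map each trial name to a list of (key, value) fields."""
    grouped = defaultdict(list)
    for trial, line in pairs:
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(':')
        grouped[trial].append((key, value))
    return dict(grouped)


# Hypothetical two-trial sample in the same layout as the question's file.
sample = io.StringIO(
    "handle:trial1\n"
    "goaly:12\n"
    "fractal:images/file.png\n"
    "seen:true\n"
    "pauseTimes:\n"
    "handle:trial2\n"
    "goaly:7\n"
    "fractal:images/other.png\n"
    "seen:False\n"
    "pauseTimes:\n"
)

records = group_by_trial(parse(sample))
for trial, fields in records.items():
    print(trial, fields)
```

Keeping the grouping in its own function leaves the parsing state inside the generator, which is the separation the answer argues for.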