Question

I have a text file that looks like this:

Time: 2014030218

Add: India
Name: Sourav

 k io res tec e
 1 4  3   9   2
 2 3  1   4   4

Time: 2014030300

Add: India
Name: Sourav

 k io res tec e
 1 3  4   8   3
 2 2  2   6   4
 3 2  3   6   6

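I want a new file like this one, but without the Add and Name information in each time step (it is the same for every time step). In addition, I want to keep only the rows that satisfy limits on the res and tec columns, such as 1 <= res <= 3 and 4 <= tec <= 7.

So the result should look like this: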

Time: 2014030218

 k io res tec e
 1 3  1   4   4

Time: 2014030300

 k io res tec e
 1 2  2   6   4
 2 2  3   6   6

Note: k is the serial number, so the rows that remain are renumbered from 1 in each time step.

Answer 1

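Read each line of the file into a variable line, test it with line.startswith('Add: ') and line.startswith('Name: '), and skip those lines while writing out the others. Keep track of whether you are in the rows that follow the k io ... line, so that you can check the actual values there (and skip again where appropriate):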

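with open('input.txt') as ifp, open('output.txt', 'w') as ofp:
    tec_seen = False    # True while reading the rows below a 'k io res tec e' header
    empty_line = False  # True if the last line written was blank
    for line in ifp:
        start_word = line.split(':', 1)[0]
        if start_word in ('Add', 'Name'):
            continue  # drop the repeated Add/Name lines
        if start_word == 'Time':
            tec_seen = False  # a new time step begins
        if line.lstrip().startswith('k io res tec e'):
            tec_seen = True
            empty_line = False
            ofp.write(line)
            continue
        if tec_seen and line.strip():
            vals = line.split()
            res = int(vals[2])
            tec = int(vals[3])
            if not 1 <= res <= 3:
                continue
            if not 4 <= tec <= 7:
                continue
            # reset here too, otherwise the blank line after the table is dropped
            empty_line = False
        elif not line.strip():
            if empty_line:
                continue  # collapse consecutive blank lines
            empty_line = True
        else:
            empty_line = False
        ofp.write(line)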
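
The loop above writes each surviving row unchanged, so k keeps its original value, whereas the expected output in the question numbers the kept rows from 1 in every time step. Below is a minimal sketch of one way to add that renumbering. The function name filter_records and its res_range and tec_range parameters are made up for illustration, and the column spacing of the rewritten rows is an assumption copied from the sample in the question.

```python
def filter_records(lines, res_range=(1, 3), tec_range=(4, 7)):
    """Yield the filtered lines, renumbering k within each time step.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    in_table = False    # True while reading rows below the column header
    prev_blank = False  # True if the last line yielded was blank
    k = 0               # running serial number of the kept rows
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(('Add:', 'Name:')):
            continue  # drop the repeated Add/Name lines
        if not stripped:
            in_table = False  # a blank line ends the current table
            if prev_blank:
                continue      # collapse consecutive blank lines
            prev_blank = True
            yield '\n'
            continue
        prev_blank = False
        if stripped.startswith('Time:'):
            in_table = False
        elif stripped.split() == ['k', 'io', 'res', 'tec', 'e']:
            in_table, k = True, 0  # restart numbering for this time step
        elif in_table:
            _, io_val, res, tec, e = stripped.split()
            keep = (res_range[0] <= int(res) <= res_range[1]
                    and tec_range[0] <= int(tec) <= tec_range[1])
            if not keep:
                continue
            k += 1
            # Assumed layout: same spacing as the sample rows in the question.
            yield ' {} {}  {}   {}   {}\n'.format(k, io_val, res, tec, e)
            continue
        yield line if line.endswith('\n') else line + '\n'
```

Because it is a generator over any iterable of lines, it can be used with open files, for example by calling ofp.writelines(filter_records(ifp)) inside a with block that opens input.txt as ifp and output.txt as ofp.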
