Question

I am parsing a .txt file like this:

def parse_file(src):
    for line in src.readlines():
        if re.search('SecId', line):
            continue
        else:
            cols = line.split(',')
            Time = cols[4]
            output_file.write('{}\n'.format(
                          Time))

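I assumed cols is a list that I can index into. It does print the values I want, but it then fails with an out-of-range error:

  File "./tdseq.py", line 37, in parse_file
    Time = cols[4]
IndexError: list index out of range
make: *** [all] Error 1

The data I am using: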

I10.FE,--,xx,xxxx,13450,tt,tt,tt,33,22,22:33:44

Answer 1

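It is hard to say without seeing the data.

A likely cause is that you are assuming 1-based indexing. A line such as:

foo,bar,baz,qux

is indexed as positions 0, 1, 2, 3 in the list.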

By the way, I strongly recommend using the csv module to parse the file.
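A minimal sketch of what the csv approach could look like. The function name extract_column and the in-memory sample are made up for illustration; only csv.reader and the 'SecId' check come from the thread. Rows that are too short (including blank lines, which csv.reader yields as empty lists) are skipped instead of raising IndexError.

```python
import csv
import io

def extract_column(src, index, skip_marker='SecId'):
    """Return the field at `index` from every data row of a CSV stream.

    `extract_column` and `skip_marker` are hypothetical names for this sketch.
    """
    values = []
    for row in csv.reader(src):
        # csv.reader yields [] for a blank line; skip it and any short row
        if len(row) <= index:
            continue
        # Skip header rows, mirroring the 'SecId' check in the question
        if any(skip_marker in field for field in row):
            continue
        values.append(row[index])
    return values

# Illustrative input: a header row, the data row from the question, a blank line
sample = io.StringIO(
    "SecId,a,b,c,d\n"
    "I10.FE,--,xx,xxxx,13450,tt,tt,tt,33,22,22:33:44\n"
    "\n"
)
result = extract_column(sample, 4)
print(result)
```

With a real file, the same function could be called on the open file object; the csv documentation suggests opening such files with newline=''.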

Answer 2

You are getting an IndexError because cols does not have five elements. Maybe there are blank lines in the file?
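A small illustration of the blank-line case, assuming the file contains at least one empty line (the thread does not confirm this). It would also explain why the correct values appear before the error: every valid line is written out first, and the failure only happens when the short line is reached.

```python
# What a blank line looks like when it is read from a file
blank = "\n"
cols = blank.split(',')

# split always returns at least one element; here it is the lone newline
assert cols == ['\n']
assert len(cols) == 1

# Asking for a fifth element therefore fails
try:
    cols[4]
except IndexError as exc:
    message = str(exc)

print(message)
```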

Also note that the better way to iterate over the lines of a file is:

for line in src:

If you are searching for a plain string, you do not need a regular expression. This is enough:

if 'SecId' in line:
    continue

Answer 3

Check with len(cols). Also, your input data suggests time_index=3 rather than 4:

from __future__ import print_function

def parse_file(input_file):
    time_index = 3
    for line in input_file:
        if 'SecId' not in line:
            cols = line.split(',')
            if len(cols) > time_index:
                time = cols[time_index]
                print(time, file=output_file)
