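Reading a file by line position

Time: 2017-03-09 01:46:34

Tags: python function parsing

I have a data file with 3 columns. What I want to do is create a function that takes a start and an end line number as input. Something like this: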

from datetime import datetime

def read_lines(start_line, end_line):  # start_line and end_line are not used yet
    with open("data.txt", 'r') as f:
        for line in f:
            splitted_line = line.strip().split(",")
            date1 = datetime.strptime(splitted_line[0],'%Y%m%d:%H:%M:%S.%f')
            price = float(splitted_line[1])
            volume = int(splitted_line[2])
            my_tuple=(date1,price,volume)

4 Answers:

Answer 0 (score: 1)

from datetime import datetime

def func(start,end):
    # Parse lines whose 0-based index is in [start, end); line `end` is excluded
    rows = []
    with open("data.txt", 'r') as f:
        for idx,line in enumerate(f):
            if idx == end:
                break
            if idx < start:
                continue

            splitted_line = line.strip().split(",")
            date1 = datetime.strptime(splitted_line[0],'%Y%m%d:%H:%M:%S.%f')
            price = float(splitted_line[1])
            volume = int(splitted_line[2])
            rows.append((date1,price,volume))
    return rows
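
As a possible alternative to the manual index checks above, itertools.islice can do the skipping. This is only a sketch: parse_lines is a hypothetical name, the sample rows are made up to match the question's date format, and the range is 0-based and half-open like the loop above.

```python
from datetime import datetime
from itertools import islice


def parse_lines(lines, start, end):
    """Parse the lines whose 0-based index is in [start, end)."""
    rows = []
    # islice skips the first `start` lines and stops before index `end`
    for line in islice(lines, start, end):
        date_str, price, volume = line.strip().split(",")
        date1 = datetime.strptime(date_str, "%Y%m%d:%H:%M:%S.%f")
        rows.append((date1, float(price), int(volume)))
    return rows


# Made-up sample data; any iterable of lines works, including an open file
sample = [
    "20170309:01:46:34.000000,10.5,100\n",
    "20170309:01:46:35.500000,11.0,200\n",
    "20170309:01:46:36.250000,11.5,300\n",
    "20170309:01:46:37.000000,12.0,400\n",
]
rows = parse_lines(sample, 1, 3)
for row in rows:
    print(row)
```

With a real file, the same function could be called as parse_lines(f, start, end) inside a with open("data.txt") as f: block, since a file object is itself an iterable of lines.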

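Answer 1 (score: 0)

If I am reading this correctly, the function should only read the lines numbered within the range [start_line, end_line] (I am assuming this is an inclusive range, i.e. you want to read both the start line and the end line as well). Why not write your for loop with enumeration and simply skip the lines that fall outside the passed range?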

def read_line_range_inclusive(start_line, end_line):
    filename = "data.txt"
    with open(filename) as f:
        for i, line in enumerate(f):
            if i < start_line: # will read the start line itself
                continue # keep going...
            if i > end_line: # will read the end line itself
                break # we're done

            # ... perform operations on lines ...

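Also, be careful when splitting on commas; this works for simple lines like 1,2,3, but what about 1,2,"a,b,c",3, where "a,b,c" should not be split into separate columns? I suggest using the built-in csv module, which handles these edge cases automatically: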

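import csv

def read_line_range_inclusive(start_line, end_line):
    filename = "data.txt"
    with open(filename, newline="") as f:
        # i counts parsed records (0-based), not physical lines
        for i, row in enumerate(csv.reader(f)):
            # row will already be separated into list
            if i < start_line:
                continue
            if i > end_line:
                break
            # ... proceed as before ...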

Note that you can only use the with statement on the file object itself, not on the object returned by csv.reader, so this does not work: with csv.reader(open(filename)) as f:
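
To make the quoting caveat from this answer concrete, here is a small sketch comparing a plain split with csv.reader on the example line. The io.StringIO wrapper is only there so the snippet runs without a file.

```python
import csv
import io

line = '1,2,"a,b,c",3'

# A plain split cuts through the quoted field
naive = line.strip().split(",")

# csv.reader keeps the quoted field together and drops the quotes
parsed = next(csv.reader(io.StringIO(line)))

print(naive)   # ['1', '2', '"a', 'b', 'c"', '3']
print(parsed)  # ['1', '2', 'a,b,c', '3']
```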

Answer 2 (score: 0)

If you use a CSV reader, you have access to the line number:

csvreader.line_num
  

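The number of lines read from the source iterator. This is not the same as the number of records returned, as records can span multiple lines.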
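
The difference between lines and records can be seen in a short sketch. The input below is made up: its first record contains a quoted field with an embedded newline, so it spans two physical lines.

```python
import csv
import io

# Two records over three physical lines (made-up data)
data = 'a,"first\nsecond"\nb,c\n'

reader = csv.reader(io.StringIO(data))
seen = []
for row in reader:
    # line_num is the count of physical lines consumed so far
    seen.append((reader.line_num, row))

print(seen)  # [(2, ['a', 'first\nsecond']), (3, ['b', 'c'])]
```

So when filtering on line_num, a record that spans several lines is reported at the number of its last line, which may matter if start and end are meant as physical line numbers.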

Answer 3 (score: 0)

We can combine the linecache module with csv to get the job done:

import csv
import linecache


def get_lines(filename, start_line_number, end_line_number):
    """
    Given a file name, start line and end line numbers,
    return those lines in the file
    """
    for line_number in range(start_line_number, end_line_number + 1):
        yield linecache.getline(filename, line_number)


if __name__ == '__main__':
    # Get lines 4-6 inclusive from the file
    lines = get_lines('data.txt', 4, 6)
    reader = csv.reader(lines)

    for row in reader:
        print(row)

Consider the data file data.txt:

# this is line 1
# line 2

501,john
502,karen
503,alice

# skip this line
# and this, too

The code above will produce the following output:

['501', 'john']
['502', 'karen']
['503', 'alice']

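Discussion

  • linecache is a little-known library that lets you quickly retrieve lines from a text file
  • csv is a library for handling comma-separated values
  • By combining them, we can get the job done with little effort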
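
A few properties of linecache.getline are worth keeping in mind with this approach: line numbers are 1-based, the returned line keeps its trailing newline, and a line number outside the file yields an empty string instead of raising. The sketch below checks this on a temporary file created only for the demonstration. Note also that linecache reads and caches the whole file, so it may not suit very large files.

```python
import linecache
import os
import tempfile

# Temporary three-line file, created only for this demonstration
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

line_two = linecache.getline(path, 2)   # 1-based, newline kept
line_zero = linecache.getline(path, 0)  # there is no line 0
past_end = linecache.getline(path, 99)  # beyond the end of the file

print(repr(line_two))   # 'second\n'
print(repr(line_zero))  # ''
print(repr(past_end))   # ''

# Drop the cached contents and remove the temporary file
linecache.clearcache()
os.remove(path)
```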