I have a data file with 3 columns. What I want to do is write a function that takes a start and an end line number as input. Something like this:
from datetime import datetime

def read_lines(start_line, end_line):
    # start_line and end_line are not used yet; this currently reads every line
    with open("data.txt", 'r') as f:
        for line in f:
            splitted_line = line.strip().split(",")
            date1 = datetime.strptime(splitted_line[0], '%Y%m%d:%H:%M:%S.%f')
            price = float(splitted_line[1])
            volume = int(splitted_line[2])
            my_tuple = (date1, price, volume)
Answer 0 (score: 1)
from datetime import datetime

def func(start, end):
    with open("data.txt", 'r') as f:
        for idx, line in enumerate(f):
            if idx == end:
                # stop before the line at index `end` (end is exclusive)
                break
            if idx < start:
                # skip lines before the requested range
                continue
            splitted_line = line.strip().split(",")
            date1 = datetime.strptime(splitted_line[0], '%Y%m%d:%H:%M:%S.%f')
            price = float(splitted_line[1])
            volume = int(splitted_line[2])
            my_tuple = (date1, price, volume)
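The same skip-and-stop logic can also be expressed with itertools.islice, which slices any iterable of lines (including an open file) by 0-based index. The following is only a sketch: parse_line, read_rows and the sample data are hypothetical names and values introduced for illustration, and the end index is exclusive, as in the code above.

```python
from datetime import datetime
from itertools import islice

def parse_line(line):
    """Parse one 'timestamp,price,volume' line into a tuple."""
    date_str, price_str, volume_str = line.strip().split(",")
    return (
        datetime.strptime(date_str, "%Y%m%d:%H:%M:%S.%f"),
        float(price_str),
        int(volume_str),
    )

def read_rows(lines, start, end):
    """Return parsed rows for 0-based line indices start <= idx < end."""
    return [parse_line(line) for line in islice(lines, start, end)]

# Hypothetical sample data standing in for the contents of data.txt
sample = [
    "20140101:09:30:00.000000,10.5,100\n",
    "20140101:09:30:01.500000,10.6,200\n",
    "20140101:09:30:02.250000,10.4,300\n",
    "20140101:09:30:03.000000,10.7,400\n",
]
rows = read_rows(sample, 1, 3)  # the second and third lines
```

With a real file, the call would look like read_rows(f, start, end) inside a with open("data.txt") as f: block. Unlike the loop above, this version collects the tuples in a list and returns them.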
Answer 1 (score: 0)
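If I am reading this correctly, the function should only read the lines numbered in the range [start_line, end_line] (I am assuming this is an inclusive range, i.e. you want to read both the start and the end line as well). Why not write your for loop with enumeration and simply skip the lines that fall outside the range passed in?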
def read_line_range_inclusive(start_line, end_line):
    filename = "data.txt"
    with open(filename) as f:
        for i, line in enumerate(f):
            if i < start_line:  # will read the start line itself
                continue        # keep going...
            if i > end_line:    # will read the end line itself
                break           # we're done
            # ... perform operations on lines ...
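Also, be careful when splitting on commas. This works for a simple line like 1,2,3, but what about 1,2,"a,b,c",3, where "a,b,c" should not be split into separate columns? I suggest using the built-in csv module, which handles these edge cases automatically: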
import csv

def read_line_range_inclusive(start_line, end_line):
    filename = "data.txt"
    with open(filename) as f:
        for i, row in enumerate(csv.reader(f)):
            # row is already split into a list of fields
            if i < start_line:
                continue
            if i > end_line:
                break
            # ... proceed as before ...
Note that you can use the with statement only on the file object itself, not on the object returned by csv.reader, so this will not work:
with csv.reader(open(filename)) as f:
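To make the quoted-comma problem concrete, here is a small sketch comparing a plain split with csv.reader on the example line from above. The variable names are only illustrative.

```python
import csv

line = '1,2,"a,b,c",3'

# A plain split cuts the quoted field apart and keeps the quote characters
naive = line.split(",")

# csv.reader accepts any iterable of lines, so a one-element list works
parsed = next(csv.reader([line]))

print(naive)   # ['1', '2', '"a', 'b', 'c"', '3']
print(parsed)  # ['1', '2', 'a,b,c', '3']
```

The split produces six pieces, while csv.reader returns the expected four fields with the quotes removed.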
Answer 2 (score: 0)
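If you use the CSV reader, you have access to the line number:
csvreader.line_num
The number of lines read from the source iterator. This is not the same as the number of records returned, as records can span multiple lines.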
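The difference between lines and records shows up as soon as a quoted field contains a line break. A minimal sketch, using a hypothetical in-memory list of lines instead of a real file:

```python
import csv

# Four physical lines, but only three records: the second record has a
# quoted field that spans two lines.
lines = [
    'a,b\n',
    '"multi\n',
    'line",c\n',
    'x,y\n',
]

reader = csv.reader(lines)
results = []
for row in reader:
    # line_num is the number of source lines consumed so far
    results.append((reader.line_num, row))

for line_num, row in results:
    print(line_num, row)
```

Here line_num is 1, 3 and 4 for the three records, so it can be used to filter by physical line number, but it does not match the record index once a record spans several lines.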
Answer 3 (score: 0)
We can combine the linecache module with csv to get the job done:
import csv
import linecache

def get_lines(filename, start_line_number, end_line_number):
    """
    Given a file name, start line and end line numbers,
    return those lines in the file
    """
    for line_number in range(start_line_number, end_line_number + 1):
        yield linecache.getline(filename, line_number)

if __name__ == '__main__':
    # Get lines 4-6 inclusive from the file
    lines = get_lines('data.txt', 4, 6)
    reader = csv.reader(lines)
    for row in reader:
        print(row)
Consider the data file data.txt:
# this is line 1
# line 2

501,john
502,karen
503,alice
# skip this line
# and this, too
The code above produces the following output:
['501', 'john']
['502', 'karen']
['503', 'alice']
linecache is a lesser-known library that lets you quickly retrieve lines from a text file.
csv is a library for handling comma-separated values.
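Two details of linecache.getline matter here: line numbers are 1-based, and a line number past the end of the file yields an empty string rather than an exception. The sketch below checks both against a temporary copy of the sample data. It assumes that line 3 of the file is blank, so that the three data rows sit on lines 4 to 6; the temporary file and the variable names are for illustration only.

```python
import csv
import linecache
import os
import tempfile

def get_lines(filename, start_line_number, end_line_number):
    """Yield lines start..end (1-based, inclusive) from the file."""
    for line_number in range(start_line_number, end_line_number + 1):
        yield linecache.getline(filename, line_number)

# Sample data, assuming a blank third line
content = (
    "# this is line 1\n"
    "# line 2\n"
    "\n"
    "501,john\n"
    "502,karen\n"
    "503,alice\n"
    "# skip this line\n"
    "# and this, too\n"
)

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(content)
    path = tmp.name

try:
    rows = list(csv.reader(get_lines(path, 4, 6)))
    first = linecache.getline(path, 1)     # 1-based: the first line
    beyond = linecache.getline(path, 100)  # past the end: empty string
finally:
    linecache.clearcache()  # linecache keeps file contents in memory
    os.remove(path)

print(rows)
```

Because linecache reads and caches the whole file on first access, this approach is convenient for repeated lookups but is not a way to avoid loading a large file.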