How do I read the rest of the lines? - Python

Time: 2013-12-03 12:41:42

Tags: python file-io

I have a file that starts with some header lines, for example:

header1 lines: somehting something
more headers then
somehting something
----

this is where the data starts
yes data... lots of foo barring bar fooing data.
...
...

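I skip the header lines by looping and calling file.readlines(). Other than looping over the remaining lines and concatenating them, how else can I read the rest of the lines?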

x = """header1 lines: somehting something
more headers then
somehting something
----

this is where the data starts
yes data... lots of foo barring bar fooing data.
...
..."""

with open('test.txt','w') as fout:
  print>>fout, x

fin = open('test.txt','r')
for _ in range(5): fin.readline();
rest = "\n".join([i for i in fin.readline()])

2 answers:

Answer 0 (score: 3)

.readlines() reads all the data in the file in one go. After the first call there are no more lines left to read.

You probably want to use .readline() (no s, singular):

with open('test.txt','r') as fin:
    for _ in range(5): fin.readline()
    rest = "\n".join(fin.readlines())

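Note that because .readlines() already returns a list, you don't need to loop over the items. Each line keeps its trailing newline, so join with "" rather than "\n" if you don't want an extra blank line between lines. You can also just use .read() to read the rest of the file: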

with open('test.txt','r') as fin:
    for _ in range(5): fin.readline()
    rest = fin.read()
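
As a quick check of the two variants above, here is a minimal sketch written for Python 3. It uses an in-memory io.StringIO in place of test.txt (an assumption, so that the example is self-contained), shortened sample text, and a hypothetical helper name, read_after_headers. It suggests that .read() returns the remaining text unchanged, and that joining .readlines() with an empty string gives the same result.

```python
import io

# Shortened stand-in for the file in the question: 5 header lines, then data.
SAMPLE = (
    "header1 lines: something\n"
    "more headers then\n"
    "something something\n"
    "----\n"
    "\n"
    "this is where the data starts\n"
    "yes data...\n"
)

def read_after_headers(fobj, n_headers=5):
    """Skip n_headers lines with readline(), then return the rest as one string."""
    for _ in range(n_headers):
        fobj.readline()
    return fobj.read()

rest = read_after_headers(io.StringIO(SAMPLE))

# Same skip, but collecting the remaining lines as a list instead.
fobj = io.StringIO(SAMPLE)
for _ in range(5):
    fobj.readline()
lines = fobj.readlines()  # each item still ends with "\n"
```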

Alternatively, treat the file object as an iterable and use itertools.islice() to slice the iterable so that the first five lines are skipped:

from itertools import islice

with open('test.txt','r') as fin:
    all_but_the_first_five = list(islice(fin, 5, None))

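This produces a list of lines rather than one big string, but if you are processing the input file line by line, that is usually preferable anyway. You can loop directly over the slice and process the lines: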

with open('test.txt','r') as fin:
    for line in islice(fin, 5, None):
        # process line, first 5 will have been skipped
        pass
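
To see what islice(fin, 5, None) yields, a minimal Python 3 sketch follows. The io.StringIO object and its sample text are assumptions standing in for the opened file. The loop takes lines from the slice one at a time instead of first building a list of all remaining lines.

```python
import io
from itertools import islice

# Hypothetical stand-in for the opened file: 5 header lines, then 2 data lines.
fake_file = io.StringIO("h1\nh2\nh3\nh4\nh5\ndata 1\ndata 2\n")

processed = []
for line in islice(fake_file, 5, None):
    # "Processing" here is just stripping the newline and upper-casing.
    processed.append(line.rstrip("\n").upper())
```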

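Don't mix iterating over the file object with .readline(). The iteration protocol implemented by (Python 2) file objects uses an internal read-ahead buffer for efficiency, and .readline() knows nothing about that buffer; calling .readline() after iterating may return data from further along in the file than you expect.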
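
One way to stay on the safe side of this caveat is to skip the headers through the iteration protocol as well, so that .readline() is never mixed in. A minimal sketch, assuming an in-memory io.StringIO in place of the real file and a hypothetical helper name, skip_lines:

```python
import io

def skip_lines(iterable, n):
    """Advance an iterator by up to n items using next() only, and return it."""
    it = iter(iterable)
    for _ in range(n):
        next(it, None)  # the default avoids StopIteration on short inputs
    return it

# Hypothetical stand-in for the opened file.
fake_file = io.StringIO("h1\nh2\nh3\nh4\nh5\ndata 1\ndata 2\n")
remaining = [line for line in skip_lines(fake_file, 5)]
```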

Answer 1 (score: 1)

Skip the first 5 lines:

from itertools import islice

with open('yourfile') as fin:
    # either collect the remaining lines in a list
    data = list(islice(fin, 5, None))
    # or loop line by line instead
    for line in islice(fin, 5, None):
        print(line)
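
Note that a file object is a one-shot iterator: once list(islice(fin, 5, None)) has run, the file is exhausted, and a second islice over the same fin yields nothing. The two forms above are alternatives, not steps to run back to back. A minimal Python 3 sketch of that behaviour, with io.StringIO standing in for the file (an assumption made for self-containment):

```python
import io
from itertools import islice

# Hypothetical stand-in for the opened file.
fin = io.StringIO("h1\nh2\nh3\nh4\nh5\ndata 1\ndata 2\n")

data = list(islice(fin, 5, None))    # consumes everything up to end of file
again = list(islice(fin, 5, None))   # nothing left to read

fin.seek(0)                          # rewind to start over
after_rewind = list(islice(fin, 5, None))
```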