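Python - read in chunks and break when there are no more lines

Date: 2015-01-21 18:43:08

Tags: python

This code reads a large file line by line, processes each line, and then ends the process once no new entries appear: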

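import time

file = open('logFile.txt', 'r')
count = 0

while True:
    where = file.tell()
    line = file.readline()
    if not line:
        # No new data: count the empty read and give up after 10 of them.
        count = count + 1
        if count >= 10:
            break
        time.sleep(1)
        file.seek(where)
    else:
        pass  # process line

file.close()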

In my experience, reading line by line takes a long time, so I tried to improve this code to read a large number of lines at a time:

import time
from itertools import islice

N = 100000
count = 0
with open('logFile.txt', 'r') as file:
    while True:
        where = file.tell()
        next_n_lines = list(islice(file, N)).__iter__()
        if not next_n_lines:
            count = count + 1
            if count >= 10:
                break
            time.sleep(1)
            file.seek(where)
        for line in next_n_lines:
            pass  # process line

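It works fine except for the ending: even when there are no more lines in the file, it never ends the process (never breaks out of the while loop). Any suggestions?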

1 answer:

Answer 0 (score: 3)

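The original code already reads the file in large chunks; it just hands the data back one line at a time. All you have done is add a redundant generator that pulls N lines at a time through the file object's own line-reading machinery.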
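To make the buffering point concrete, here is a minimal sketch that inspects what `open()` returns for a text file in Python 3. The helper name `describe_buffering` and the temporary demo file are hypothetical; the exact buffer size depends on the platform and Python version.

```python
import io
import os
import tempfile

def describe_buffering(path):
    """Return the type name of the underlying buffer and the default buffer size."""
    with open(path, 'r') as f:
        # f is a text wrapper; f.buffer is the buffered binary reader beneath it,
        # which fetches data from the OS in blocks rather than line by line.
        return type(f.buffer).__name__, io.DEFAULT_BUFFER_SIZE

# Hypothetical demo file, created only for this example.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first\nsecond\n')
    demo_path = tmp.name

buffer_type, buffer_size = describe_buffering(demo_path)
os.remove(demo_path)
print(buffer_type, buffer_size)
```

Each `readline()` call is served from that in-memory buffer, so wrapping the file in another batching layer does not reduce the number of reads issued to the operating system.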

With few exceptions, the best way to iterate over the lines of a file is as follows.

with open('filename.txt') as f:
    for line in f:
        ...

If you need to iterate over a preset number of lines at a time, try the following:

from itertools import islice, chain

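def to_chunks(iterable, chunksize):
    it = iter(iterable)
    while True:
        try:
            first = next(it)
        except StopIteration:
            # No items left: end the generator. Since Python 3.7 (PEP 479),
            # a StopIteration escaping a generator becomes a RuntimeError.
            return
        rest = islice(it, chunksize - 1)
        # Each chunk is lazy; consume it fully before requesting the next one.
        yield chain((first,), rest)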


with open('filename.txt') as f:
    for chunk in to_chunks(f, 10):
        for line in chunk:
            ...
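
As for why the loop in the question never breaks: `list(...).__iter__()` returns an iterator object, and iterator objects are truthy even when exhausted, so `if not next_n_lines` is never true. Testing the list itself avoids that. Below is a minimal sketch of one way to combine chunked reads with the retry logic from the question. The name `follow_in_chunks`, its parameters, and the temporary demo file are hypothetical. Unlike the question's code, it resets the idle counter whenever new lines arrive, and it assumes that a `readline()` call made after end-of-file returns any data appended in the meantime.

```python
import os
import tempfile
import time
from itertools import islice

def follow_in_chunks(path, chunk_size=10, max_idle=10, delay=1.0):
    """Yield lists of up to chunk_size lines; stop after max_idle empty polls in a row."""
    idle = 0
    with open(path, 'r') as f:
        while idle < max_idle:
            # iter(f.readline, '') stops at end-of-file; islice caps the chunk size.
            chunk = list(islice(iter(f.readline, ''), chunk_size))
            if chunk:
                # An empty list is falsy, so this test works where the iterator test did not.
                idle = 0
                yield chunk
            else:
                idle += 1
                time.sleep(delay)

# Hypothetical demo file with 25 lines, created only for this example.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.writelines('line %d\n' % i for i in range(25))
    demo_path = tmp.name

sizes = [len(chunk) for chunk in
         follow_in_chunks(demo_path, chunk_size=10, max_idle=2, delay=0)]
os.remove(demo_path)
print(sizes)
```

A partially written last line would be returned as-is by `readline()`, so a real log follower may want to hold back lines that do not end in a newline.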