Question

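I need to get the value of the previous line in a file and compare it with the current line as I iterate through the file. The file is huge, so I cannot read it all into memory, and I cannot access lines by number with linecache either, because that library function still reads the whole file into memory.

EDIT: Sorry, I forgot to mention that I have to read the file backwards.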

EDIT2

I tried the following:

 f = open("filename", "r")
 for line in reversed(f.readlines()): # this doesn't work because there are too many lines to read into memory

 line = linecache.getline("filename", num_line) # this also doesn't work due to the same problem above.

Answer 1

Just save the previous line as you iterate to the next one:

prevLine = ""
with open("filename") as f:
    for line in f:
        # do some work here; prevLine still holds the previous line
        prevLine = line

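This stores the previous line in prevLine as you loop.

EDIT: Apparently the OP needs to read this file backwards:

After about an hour of research, I kept running afoul of the memory constraint.

Here you go, Lim. That guy knows what he's doing, and this is his best idea: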

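General approach #2: Read the entire file, store the positions of the lines

With this approach, you also read through the entire file once, but instead of storing the entire file (all the text) in memory, you only store the binary positions in the file where each line starts. You can store these positions in a data structure similar to the one that stores the lines in the first approach.

Whenever you want to read line X, you have to re-read the line from the file, starting at the position you stored for the start of that line.

Pros: Almost as easy to implement as the first approach.

Cons: Can take a while to read large files.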
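The approach quoted above could be sketched roughly as follows. This is only a minimal sketch, not the quoted author's code: the function names build_line_index and read_lines_backwards are made up for illustration, the file is opened in binary mode so that offsets are plain byte counts, and the offsets go into an array of 64-bit integers to keep the index small. Note that the index still grows with the number of lines, so it helps only when the offsets fit in memory even though the text does not.

```python
from array import array


def build_line_index(path):
    """Scan the file once and record the byte offset where each line starts."""
    offsets = array("q")  # compact signed 64-bit integers, one per line
    with open(path, "rb") as f:
        pos = 0
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets


def read_lines_backwards(path, offsets):
    """Yield lines (as bytes) from last to first by seeking to each offset."""
    with open(path, "rb") as f:
        for pos in reversed(offsets):
            f.seek(pos)
            yield f.readline()
```

Each yielded line is a bytes object, so call .decode() on it if you need text. To compare neighbours while walking backwards, keep the previously yielded line in a variable exactly as in the forward loop.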

Answer 2

@Lim, here is how I would write it (in reply to the comments):

def do_stuff_with_two_lines(previous_line, current_line):
    print("--------------")
    print(previous_line)
    print(current_line)

with open('my_file.txt', 'r') as my_file:
    # Prime current_line with the first line; readline() returns '' if the file is empty.
    current_line = my_file.readline()

    for line in my_file:
        previous_line = current_line
        current_line = line

        do_stuff_with_two_lines(previous_line, current_line)

Answer 3

I wrote a simple generator for this task:

def pairwise(fname):
    with open(fname) as fin:
        # next() with a default avoids an error on an empty file
        prev = next(fin, None)
        if prev is None:
            return
        for line in fin:
            yield prev, line
            prev = line

Alternatively, you can use the pairwise recipe from itertools:

import itertools

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    # On Python 2, use itertools.izip here so the pairs stay lazy
    return zip(a, b)
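As a usage sketch, the recipe can be fed an open file directly. A file object is a lazy iterator, so only a couple of lines are held in memory at a time. The example below assumes Python 3, uses io.StringIO as a stand-in for a real file, and the helper find_repeats is a made-up name for illustration. On Python 3.10 and later, itertools.pairwise is built in and can replace the recipe.

```python
import io
import itertools


def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)


def find_repeats(lines):
    """Return every line that is identical to the line before it."""
    return [cur for prev, cur in pairwise(lines) if cur == prev]


# io.StringIO stands in for open("filename") in this sketch
sample = io.StringIO("a\na\nb\nc\nc\n")
repeats = find_repeats(sample)
print(repeats)
```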

Read the previous line in a file in Python
