Question

How can I print the last line of a text file that is about 612 MB and contains roughly 4 million lines, each consisting of "This is a line"? So far I have:

File.py

f = open("foo.txt","r+")
datalist = []
for line in f:
    datalist.append(line)
print(datalist[-1])

The only problem I see with this code is that it uses a lot of memory. I have heard of people using os.lseek, but I don't know how to implement it.

Answer 1

If you only need the last line, throw away everything else.

with open('foo.txt') as f:
    for line in f:
        pass

# `line` is the last line of the file.

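A much faster (but much less readable) approach is to start at the end of the file and move backwards byte by byte until you find a \n, then read from there.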

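import os

with open('foo.txt', 'rb') as f:  # binary mode, so offsets are plain byte positions
    f.seek(0, os.SEEK_END)        # f.seek is the file-object counterpart of os.lseek
    pos = f.tell() - 1            # index of the final byte, normally the trailing '\n'
    # Walk backwards until the byte before `pos` is a newline or the start is reached.
    while pos > 0:
        f.seek(pos - 1)
        if f.read(1) == b'\n':
            break
        pos -= 1
    f.seek(max(pos, 0))           # an empty file leaves pos at -1
    line = f.read().decode()

# `line` is the last line of the file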

This works by scanning the file backwards from the end, looking for the first newline it meets, and then reading forward from there.
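Stepping back one byte at a time costs a seek and a read per byte. A common variation is to read fixed-size blocks from the end instead. The following is only a sketch of that idea: the function name last_line and the chunk_size parameter are made up for illustration, and it returns bytes, so the caller has to decode them.

```python
import os

def last_line(path, chunk_size=4096):
    """Return the last line of `path` as bytes (hypothetical helper)."""
    with open(path, 'rb') as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        tail = b''  # everything read so far, always reaching to the end of the file
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            tail = f.read(step) + tail
            # Look for a newline, skipping the very last byte of the file,
            # which is usually the newline that terminates the last line.
            idx = tail.rfind(b'\n', 0, len(tail) - 1)
            if idx != -1:
                return tail[idx + 1:]
        # No earlier newline: the whole file is a single line (or is empty).
        return tail
```

Like the loop-based versions, this keeps the trailing newline on the returned line. For a file of short lines, a single block read from the end is normally enough.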

Answer 2

Here is a very simple improvement that stores only one line at a time:

f = open("foo.txt","r")
data = None
for line in f:
    data = line
print(data)

Or you can pick up the final value of the loop variable after the loop:

f = open("foo.txt","r")
line = None
for line in f:
    pass
print(line)

Note that in this example, line will be None if the file is empty (which is the reason for the initial assignment to None).
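To see the empty-file case in action, here is a small sketch. The wrapper name last_line_or_none is hypothetical, and the empty file is a temporary file created just for the demonstration.

```python
import os
import tempfile

def last_line_or_none(path):
    """Return the last line of `path`, or None if the file is empty (hypothetical helper)."""
    line = None  # stays None when the loop body never runs
    with open(path) as f:
        for line in f:
            pass
    return line

# Demonstration on an empty temporary file.
fd, empty_path = tempfile.mkstemp()
os.close(fd)
print(last_line_or_none(empty_path))  # prints None
os.remove(empty_path)
```

Without the initial assignment, the final use of line would raise a NameError for an empty file, because the for loop never binds the name.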

Answer 3

A quick improvement is to throw away datalist and keep only the most recent line, since that is all you care about.

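with open("foo.txt", "r") as f:  # read-only is enough; "r+" would also need write permission
    for line in f:
        pass
print(line)  # raises NameError if the file is empty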

I imagine there are more efficient ways to do this as well; I just wanted to offer one derived directly from your code.

Answer 4

You don't need to append each line to a list. Just use the loop variable:

line = None  # prevents a NameError if the file is empty

with open("foo.txt", "r") as f:
    for line in f:
        pass
print(line)

Answer 5

Take a look at deque in the collections module. There is a recipe for viewing the last 'n' lines of a file, i.e. tail.

https://docs.python.org/2/library/collections.html#deque-recipes

from collections import deque

def tail(filename, n=10):
    'Return the last n lines of a file'
    with open(filename) as f:  # close the file once the deque is built
        return deque(f, n)
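For the single-line case asked about here, the same recipe can be used with a maximum length of 1. This is a minimal sketch; the name last_line is an assumption for illustration and is not part of the collections documentation.

```python
from collections import deque

def last_line(filename):
    """Return the last line of a file, or None if the file is empty (hypothetical helper)."""
    with open(filename) as f:
        # maxlen=1 makes the deque discard every line except the newest one,
        # so only one line is held in memory at any time.
        last = deque(f, maxlen=1)
    return last[0] if last else None
```

This still reads the whole file from start to finish, so it fixes the memory problem but not the running time.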
