Question

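I'd like to make the output of tail -F, or something similar, available to me in Python without blocking or locking. I found some really old code to do that here, but I think there must be a better way or a library to do the same thing by now. Does anyone know of one?

Ideally, I'd have something like tail.getNewData() that I could call every time I want more data.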

Answer 1

Non-blocking

If you are on Linux (Windows does not support calling select on files), you can use the subprocess module together with the select module.

import time
import subprocess
import select

# filename is the path of the file to follow
f = subprocess.Popen(['tail', '-F', filename],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
p = select.poll()
p.register(f.stdout)

while True:
    if p.poll(1):  # wait up to 1 ms for data on the pipe
        print(f.stdout.readline().decode(), end='')
    time.sleep(1)

This polls the output pipe for new data and prints it when it is available. Normally the time.sleep(1) and print calls would be replaced with useful code.
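The polling loop above can be wrapped into something closer to the tail.getNewData() call the question asks for. The following is only a sketch under some assumptions: PipeLineReader and get_new_data are hypothetical names, it relies on select.poll (so not Windows), and it reads the raw file descriptor with os.read rather than readline, so that lines sitting in a Python-side buffer are not missed between polls.

```python
import os
import select


class PipeLineReader:
    """Hypothetical helper: collect complete lines from a pipe without blocking."""

    def __init__(self, fd):
        self._fd = fd
        self._poller = select.poll()
        self._poller.register(fd, select.POLLIN)
        self._pending = b""  # bytes of a line that is not terminated yet

    def get_new_data(self, timeout_ms=0):
        """Return the complete lines available now (possibly an empty list)."""
        while self._poller.poll(timeout_ms):
            chunk = os.read(self._fd, 65536)
            if not chunk:  # end of file: the writer closed the pipe
                break
            self._pending += chunk
            timeout_ms = 0  # after the first chunk, only take what is ready
        # Everything before the last newline is complete; keep the rest.
        *complete, self._pending = self._pending.split(b"\n")
        return [line.decode(errors="replace") for line in complete]
```

With the Popen object from the answer, this would presumably be used as reader = PipeLineReader(f.stdout.fileno()), followed by reader.get_new_data() whenever more data is wanted.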

Blocking

You can use the subprocess module without the extra select module calls.

import subprocess

# filename is the path of the file to follow
f = subprocess.Popen(['tail', '-F', filename],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while True:
    line = f.stdout.readline()  # blocks until tail writes a line
    print(line.decode(), end='')

This will also print new lines as they are added, but it will block until the tail program is closed, probably with f.kill().
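One way to keep the simple blocking readline() and still get a call that returns immediately is to move the read onto a background thread. This is a sketch of that general pattern, not something the answer itself proposes; start_line_pump and get_new_data are hypothetical names.

```python
import queue
import threading


def start_line_pump(stream):
    """Read lines from a blocking stream on a daemon thread and queue them."""
    lines = queue.Queue()

    def pump():
        while True:
            line = stream.readline()  # blocks, but only this thread waits
            if not line:              # empty result means end of stream
                break
            lines.put(line)

    threading.Thread(target=pump, daemon=True).start()
    return lines


def get_new_data(lines):
    """Return every line queued so far without waiting for more."""
    collected = []
    while True:
        try:
            collected.append(lines.get_nowait())
        except queue.Empty:
            return collected
```

With the Popen object from the answer, lines = start_line_pump(f.stdout) would start the reader, and get_new_data(lines) could then be called whenever convenient.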

Answer 2

Using the sh module (pip install sh):

from sh import tail
# runs forever
for line in tail("-f", "/var/log/some_log_file.log", _iter=True):
    print(line)

[update]

Since sh.tail with _iter=True is a generator, you can:

import sh
tail = sh.tail("-f", "/var/log/some_log_file.log", _iter=True)

Then you can "getNewData" with:

new_data = next(tail)

Note that if the tail buffer is empty, it will block until there is more data (from your question it is not clear what you want to do in this case).

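[update]

This works if you replace -f with -F, but in Python it would be locking. I'd be more interested in having a function I could call to get new data when I want it, if that's possible. - Eli

A container generator that places the tail call inside a while True loop and catches eventual I/O exceptions will have almost the same effect as -F.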

import sh

def tail_F(some_file):
    while True:
        try:
            for line in sh.tail("-f", some_file, _iter=True):
                yield line
        except sh.ErrorReturnCode_1:
            yield None

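If the file becomes inaccessible, the generator will return None. However, it still blocks until there is new data if the file is accessible. It remains unclear to me what you want to do in this case.

Raymond Hettinger's approach seems pretty good: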

import os

def tail_F(some_file):
    first_call = True
    while True:
        try:
            with open(some_file) as input:
                if first_call:
                    input.seek(0, 2)
                    first_call = False
                latest_data = input.read()
                while True:
                    if '\n' not in latest_data:
                        latest_data += input.read()
                        if '\n' not in latest_data:
                            yield ''
                            if not os.path.isfile(some_file):
                                break
                            continue
                    latest_lines = latest_data.split('\n')
                    if latest_data[-1] != '\n':
                        latest_data = latest_lines[-1]
                    else:
                        latest_data = input.read()
                    for line in latest_lines[:-1]:
                        yield line + '\n'
        except IOError:
            yield ''

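This generator will return '' if the file becomes inaccessible or if there is no new data.

[update]

The second to last answer seems to circle around to the top of the file when it runs out of data. - Eli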

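I think the second one will output the last ten lines whenever the tail process ends, which with -f happens whenever there is an I/O error. The tail --follow --retry behavior is not far from this for most cases I can think of in Unix-like environments.

Perhaps if you update your question to explain what your real goal is (the reason you want to mimic tail), you will get a better answer.

The last answer does not actually follow the tail; it merely reads what is available at run time. - Eli

Of course, tail displays the last 10 lines by default... You can position the file pointer at the end of the file using file.seek. I will leave a proper implementation as an exercise to the reader.

IMHO the file.read() approach is far more elegant than a subprocess-based solution.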
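As a rough illustration of the file.seek idea combined with the "call it when I want new data" requirement from the comments, here is a minimal sketch. It is not the code from this answer: FileTail and get_new_data are hypothetical names, rotation is detected by comparing inode numbers (which assumes a Unix-like file system), and a replaced file is only picked up on the following call.

```python
import os


class FileTail:
    """Hypothetical helper: non-blocking, tail -F style reader for one path."""

    def __init__(self, path):
        self.path = path
        self._file = None
        self._pending = b""      # bytes of a line not yet terminated
        self._open(at_end=True)  # like tail -n 0, skip what is already there

    def _open(self, at_end):
        try:
            self._file = open(self.path, "rb")
        except OSError:
            self._file = None    # missing for now; retried on the next call
            return
        if at_end:
            self._file.seek(0, os.SEEK_END)

    def _was_replaced(self):
        """True if the path names another file now, or the file shrank."""
        try:
            on_disk = os.stat(self.path)
        except OSError:
            return True
        opened = os.fstat(self._file.fileno())
        return (on_disk.st_ino != opened.st_ino
                or on_disk.st_size < self._file.tell())

    def get_new_data(self):
        """Return the complete lines added since the last call (maybe [])."""
        if self._file is None:
            self._open(at_end=False)
            if self._file is None:
                return []
        self._pending += self._file.read()  # returns at once at end of file
        if self._was_replaced():
            self._file.close()
            self._file = None    # reopen from the start on the next call
        *complete, self._pending = self._pending.split(b"\n")
        return [line.decode(errors="replace") for line in complete]
```

A caller would create tail = FileTail("/var/log/some_log_file.log") once and then call tail.get_new_data() whenever it wants more data; the call never waits for the file to grow.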

Answer 3

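So, this is coming quite late, but I ran into the same problem again, and there is a much better solution now. Just use pygtail:

Pygtail reads log file lines that have not been read. It will even handle log files that have been rotated. Based on logcheck's logtail2 (http://logcheck.org)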

Answer 4

None of the answers that use tail -f are pythonic.

Here is the pythonic way (using no external tools or libraries):

import time

def follow(thefile):
    while True:
        line = thefile.readline()
        if not line or not line.endswith('\n'):
            # no complete line yet: wait, then try again
            time.sleep(0.1)
            continue
        yield line


if __name__ == '__main__':
    logfile = open("run/foo/access-log", "r")
    loglines = follow(logfile)
    for line in loglines:
        print(line, end='')

Answer 5

You can use the "tailer" library: https://pypi.python.org/pypi/tailer/

It has an option to get the last few lines:

import tailer

# Get the last 3 lines of the file
tailer.tail(open('test.txt'), 3)
# ['Line 9', 'Line 10', 'Line 11']

It can also follow a file:

# Follow the file as it grows
for line in tailer.follow(open('test.txt')):
    print(line)

If you want tail-like behaviour, this seems to be a good option.
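If only the "last few lines" part is needed, the standard library may be enough. This is a sketch of one common idiom, not how tailer is implemented; last_lines is a hypothetical name, and it reads the whole file once, so it suits moderately sized files.

```python
from collections import deque


def last_lines(path, n=10):
    """Return the last n lines of a text file, without trailing newlines."""
    with open(path) as f:
        # A bounded deque discards older lines as newer ones are read.
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]
```

For a file like the test.txt in the example above, last_lines('test.txt', 3) would return its final three lines as a list of strings.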

Answer 6

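Adapting Ijaz Ahmad Khan's answer to yield lines only when they are completely written (that is, when the line ends with a newline character) gives a pythonic solution with no external dependencies: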

import time
from typing import Iterator

def follow(file) -> Iterator[str]:
    """ Yield each line from a file as they are written. """
    line = ''
    while True:
        tmp = file.readline()
        if tmp:  # readline() returns '' (never None) at end of file
            line += tmp
            if line.endswith("\n"):
                yield line
                line = ''
        else:
            time.sleep(0.1)


if __name__ == '__main__':
    for line in follow(open("test.txt", 'r')):
        print(line, end='')

Answer 7

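Another option is the tailhead library, which provides both Python versions of the tail and head utilities and an API that can be used in your own module.

It was originally based on the tailer module. Its main advantage is the ability to follow files by path, i.e. it can handle the situation where a file is recreated. It also includes some bug fixes for various edge cases.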

Answer 8

You can also use the 'AWK' command.
See more at: http://www.unix.com/shell-programming-scripting/41734-how-print-specific-lines-awk.html
awk can be used to tail the last line, the last few lines, or any line in a file. It can be called from Python.

Answer 9

If you are on Linux, you can get a non-blocking implementation in Python in the following way.

import subprocess
subprocess.call('xterm -title log -hold -e \"tail -f filename\"&', shell=True, executable='/bin/csh')
print("Done")

Answer 10

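Python is "batteries included" - it has a nice solution for this: https://pypi.python.org/pypi/pygtail

Reads log file lines that have not been read. Remembers where it finished last time, and continues from there.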
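The "remember where it finished" idea can be sketched with the standard library alone. This is not pygtail's implementation or API, only an illustration of the concept; read_unread_lines and the ".offset" side file are hypothetical, and the only rotation handling is restarting from the top when the log is shorter than the saved offset.

```python
import os


def read_unread_lines(log_path, offset_path=None):
    """Return complete lines added to log_path since the previous call."""
    offset_path = offset_path or log_path + ".offset"
    try:
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    except (OSError, ValueError):
        offset = 0  # no usable saved position: start from the beginning
    with open(log_path, "rb") as log:
        if os.fstat(log.fileno()).st_size < offset:
            offset = 0  # file shrank: assume it was truncated or rotated
        log.seek(offset)
        data = log.read()
    end = data.rfind(b"\n") + 1  # consume complete lines only
    with open(offset_path, "w") as f:
        f.write(str(offset + end))
    return data[:end].decode(errors="replace").split("\n")[:-1]
```

Because the position is stored on disk, a script that runs periodically (for example from cron) could call read_unread_lines on each run and see only the lines written since the previous run.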


How can I tail a log file in Python?
