读取文件十秒钟

时间:2013-10-24 06:24:48

标签: python logging

我正在使用Logfiles。我需要的是我想在一段指定的时间内逐行读取文件,比如说10秒。如果有办法在Python中实现这一点,有人可以帮助我吗?

2 个答案:

答案 0 :(得分:1)

使用tail运行tacPopen并迭代输出,直到找到要停止的行。这是一个示例代码段。

filename = '/var/log/nginx/access.log'
# Command to read file from the end
cmd = sys.platform == 'darwin' and ['tail', '-r', filename] or ['tac', filename]
# But if you want read it from beginning, use the following
#cmd = ['cat', filename]

proc = Popen(cmd, close_fds=True, stdout=PIPE, stderr=PIPE)
output = proc.stdout

FORMAT = [
    # 'foo',
    # 'bar',
]
def extract_log_data(line):
    '''Extact data in you log format, normalize it.
    '''
    return dict(zip(FORMAT, line))

csv.register_dialect('nginx', delimiter=' ', quoting=csv.QUOTE_MINIMAL)
lines = csv.reader(output, dialect='nginx')
started_at = dt.datetime.utcnow()
for line in lines:
    data = extract_log_data(line)
    print data
    if (dt.datetime.utcnow() - started_at) >= dt.timedelta(seconds=10):
        break

output.close()
proc.terminate()

答案 1 :(得分:1)

代码

from multiprocessing import Process
import time

def read_file(path):
    try:
        # open file for writing
        f = open(path, "r")
        try:
            for line in f:
                # do something
                pass

        # always close the file when leaving the try block 
        finally:
            f.close()

    except IOError:
        print "Failed to open/read from file '%s'" % (path)

def read_file_limited_time(path, max_seconds):

    # init Process
    p = Process(target=read_file, args=(path,))

    # start process
    p.start()

    # for max seconds 
    for i in range(max_seconds):

        # sleep for 1 seconds (you may change the sleep time to suit your needs)
        time.sleep(1)

        # if process is not alive, we can break the loop
        if not p.is_alive():
            break

    # if process is still alive after max_seconds, kiil it!
    if p.is_alive():
        p.terminate()

def main():
    path = "f1.txt"
    read_file_limited_time(path,10)

if __name__ == "__main__":
    main()

备注

我们每隔1秒钟“醒来”并检查我们开始的过程是否仍然存在的原因只是为了防止我们在过程结束时保持睡眠状态。如果过程在1秒后结束,则浪费时间浪费9秒钟。