Python tail -f on a log file, continuously

Time: 2016-11-10 04:16:51

Tags: python linux tail logfile

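I implemented a Python tail -f using the following code snippet, and it works perfectly well while my program runs continuously in the background as python myprogram.py &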
def follow(thefile):
    thefile.seek(0,2)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

The file passed to the function above is a log file, and it is passed in from main:
# follow.py
# Follow a file like tail -f.

import smtplib
import time
import re
import logging

# Here are the email package modules we'll need
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from job import Job


def follow(thefile):
    thefile.seek(0,2)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

def sendMail(job,occurtime):
    COMMASPACE = ', '
    outer =  MIMEMultipart()
    # msg = MIMEMultipart('alternative')
    outer['Subject'] = 'ETL Failed for job:' + job
    outer['From'] = 'eltmonitor@fms.com'
    me=  'eltmonitor@ncellfms.com'
    family = ["bibesh.pokhrel@huawei.com"]
    outer['To'] = COMMASPACE.join(family)

    inner = MIMEMultipart('alternative')
    html = """\
        <html>
          <head></head>
          <body>
            <p>Dears,<br>
               Please take necessary action to troubleshoot the ETL Error for job:""" + job + " at " + occurtime + """
            </p>
          </body>
        </html>
        """

# Record the MIME types of both parts - text/plain and text/html.
    part2 = MIMEText(html, 'html')

# Attach parts into message container.
# According to RFC 2046, the last part of a multipart message, in this case
# the HTML message, is best and preferred.
    inner.attach(part2)
    outer.attach(inner)

# Connect to SMTP server and send the email
# Parameter are from=me, to=family and outer object as string for message body
    try:
        s = smtplib.SMTP('localhost')
        s.sendmail(me,family,outer.as_string())
        s.quit()
    except smtplib.SMTPException:
        logging.info('Unable to send email')

if __name__ == '__main__':
    while True:
        logging.basicConfig(filename='/opt/etlmonitor/monitor.log',format='%(asctime)s %(levelname)s %(message)s',level=logging.DEBUG, filemode='w')
# Define two ETL Job object to store the state of email sent as boolean flag
        fm =Job()
        ncell =Job()
        try:
            with open("/opt/report/logs/GraphLog.log","r") as logfile:
            # Continually read the log files line by line

                loglines = follow(logfile)

            # Do something with the line
                for line in loglines:
            # Extract the last word in the line of log file
            # We are particularly looking for the word SUCCESS or FAILED
            # Warning!! leading whitespace character is also matched
                    etlmsg = re.search(r".*(\s(\w+)$)", line)
                    if etlmsg:
            # Remove leading whitespace
                        foundmsg = etlmsg.group(1).lstrip()
            # Process on the basis of last word
            # If it is SUCCESS , set the job mailsent flag to False so that no email is sent
            # If it is FAILED and mailsent flag of job is False, send a email and set mailsent flag to True
            # If it is FAILED and mailsent flag of job is True, do nothing as email was already sent
                        if foundmsg=='SUCCESS':
                            jobname = re.search(r": Graph '(.+?)'", line)
                            if jobname:
                                foundjob= jobname.group(1)
                                if foundjob =='Mirror.kjb':
                                    logging.info('Ncell Mirror job detected SUCCESS')
                                    ncell.p = False
                                elif foundjob =='FM_job.kjb':
                                    fm.p = False
                                    logging.info('Ncell Report job detected SUCCESS')
                                else:
                                    logging.info('No job name defined for success message')

                        elif foundmsg =='FAILED':
                            jobname = re.search(r": Graph '(.+?)'", line)
                            timevalue = re.search(r"(.+?),", line)
                            if jobname and timevalue:
                                foundjob= jobname.group(1)
                                foundtime = timevalue.group(1)
                                if foundjob =='Mirror.kjb':
                                    if ncell.p == True:
                                        logging.info('Notification Email has been already sent for job: ' + foundjob)
                                    elif ncell.p == False :
                                        ncell.p = True
                                        sendMail(foundjob,foundtime)
                                    else:
                                        logging.info("state not defined")
                                elif foundjob =="FM_job.kjb":
                                    if fm.p == True:
                                        logging.info('Notification Email has been already sent for job: ' + foundjob)
                                    elif fm.p == False:
                                        fm.p = True
                                        sendMail(foundjob,foundtime)
                                    else:
                                        logging.info('Unknown state of job')
                                else:
                                    logging.info('New job name found')

        except IOError:
            logging.info('Log file could not be found or opened')

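What I am actually doing with each line is using a regular expression to read its last word, and then performing some tasks depending on that word.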
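As a minimal illustration of that extraction, the three patterns used in the code above can be tried on a single line. The sample line and the helper name parse_line are hypothetical: the real GraphLog.log format is not shown here, so the line is only a guess that happens to fit the patterns.

```python
import re

# Hypothetical sample line; the actual GraphLog.log format is an assumption.
SAMPLE = "2016-11-10 04:16:51,123 INFO : Graph 'Mirror.kjb' finished with status FAILED\n"


def parse_line(line):
    """Return (status, job, timestamp); a field is None if its pattern does not match."""
    status = re.search(r".*(\s(\w+)$)", line)   # last word of the line
    job = re.search(r": Graph '(.+?)'", line)   # job name between single quotes
    stamp = re.search(r"(.+?),", line)          # everything before the first comma
    return (
        status.group(2) if status else None,    # group(2) excludes the leading whitespace
        job.group(1) if job else None,
        stamp.group(1) if stamp else None,
    )


print(parse_line(SAMPLE))  # ('FAILED', 'Mirror.kjb', '2016-11-10 04:16:51')
```

Taking group(2) instead of group(1) makes the later lstrip() call unnecessary, because the inner group never includes the whitespace.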

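The problem is that the log file (GraphLog.log) is rolled over based on file size. When that happens, my program stops reading as well. How can I keep reading GraphLog.log continuously, without the program terminating (no error is raised), even after the log file has been rolled over by size or date?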

Any help is much appreciated.

1 answer:

Answer 0: (score: 0)

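When the file is rotated ("rolled over", as you call it), the file you are reading from is renamed or deleted and another one is created in its place. Your reads still go to the original file. (If the file was deleted, its contents stay in place until you close it.) So what you need to do is check the return value of os.stat(filename).st_ino periodically, for example inside your follow loop. If it has changed, you need to close your current file, open it again, and start reading from the beginning.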
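A minimal sketch of that inode check, assuming a POSIX filesystem and rotation by rename or delete. The function name follow_rotating and its parameters are illustrative, not part of the original program, and rotation that truncates the file in place is not handled.

```python
import os
import time


def follow_rotating(path, poll_interval=0.1, from_start=False):
    """Yield lines appended to `path`, reopening it when the inode changes."""
    f = open(path, "r")
    try:
        if not from_start:
            f.seek(0, os.SEEK_END)  # like tail -f: skip the existing content
        inode = os.fstat(f.fileno()).st_ino
        while True:
            line = f.readline()
            if line:
                yield line
                continue
            # The current file is drained. See whether `path` now names another file.
            try:
                if os.stat(path).st_ino != inode:
                    new_f = open(path, "r")
                    f.close()
                    f = new_f  # the new file is read from its beginning
                    inode = os.fstat(f.fileno()).st_ino
                    continue
            except OSError:
                # The path may be missing briefly while rotation is in progress.
                pass
            time.sleep(poll_interval)
    finally:
        f.close()
```

In the main block this would take the path rather than an open file object, for example: for line in follow_rotating("/opt/report/logs/GraphLog.log"). Because the inode is only checked after readline() returns nothing, any lines written to the old file just before rotation are still delivered before the switch.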

Note that there are ways to do this more efficiently through the OS's event mechanisms, without polling periodically. See, for example, the watchdog API.