How do I match error strings in the last 5 lines?

Time: 2019-06-11 11:26:10

Tags: python

In my application, I have implemented this script, taken from another question:

def should_act():
    errors = ['+CMS ERROR: 8',
        '+CMS ERROR: 28',
        '+CMS ERROR: 29',
        '+CMS ERROR: 50',
        '+CMS ERROR: 226']

    with open("path/to/logfile.log") as f:
        for line in f:
            pass

    return any(error in line for error in errors)

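It works by checking every 3 seconds for a match against the strings in the errors list. But I just realized that it only looks at one line (the last one), so when the error string is not on the last line, it does not return True.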
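
A small sketch of why this happens: each pass of the for loop overwrites line, so only the final line is left when the loop ends. The io.StringIO object below is just a stand-in for the real log file, and the names last_line and found are made up for this illustration.

```python
import io

# Stand-in for the log file; the error is on the second-to-last line.
log = io.StringIO(
    "[00:44:28.484] PULL START\n"
    "[00:44:28.484] +CMS ERROR: 8\n"
    "[00:44:28.484] an empty space / null\n"
)

for line in log:
    pass  # each iteration overwrites `line`

# Only the final line survives the loop, so the error above it is never seen.
last_line = line
found = "+CMS ERROR: 8" in last_line
print(repr(last_line))
print(found)
```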

For example, here is a log file of the kind the program targets:

# This is detected as True
[00:44:28.484] PULL START
[00:44:28.484] +CMS ERROR: 8

# Now it is False
[00:44:28.484] PULL START
[00:44:28.484] +CMS ERROR: 8
[00:44:28.484] an empty space / null

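I want it to output True on the console when an error appears anywhere within the last 5 lines, counted from the bottom. I already tried return any(error in range(line, 5) for error in errors), but it raised an exception.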
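
The exception is most likely a TypeError: range() only accepts integers, while line is a string. A minimal reproduction, using one of the sample log lines as the assumed value of line:

```python
line = "[00:44:28.484] +CMS ERROR: 8\n"

try:
    range(line, 5)  # range() needs integers, not a line of text
except TypeError as exc:
    message = str(exc)
    print(message)
```

So range() cannot select lines of a file; the last lines have to be collected first and then searched.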

Can anyone help?

Update

This may be a bit longer than I thought; feel free to edit it to simplify the description.

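My program acts as a third-party application that hunts for error keys in the log and kills the vendor application that produces the errors, to prevent the queue from overloading.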

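I want it to ignore every error printed above the last 5 or 20 lines from the bottom, so that the terminator script is not triggered and does not kill the parent application (the vendor application) while it is restarting.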

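After the parent application restarts, it prints some startup lines, which pushes the previous last line up by about 5 to 20 lines. If every error in the file were detected, the parent application would never start, because my application would kill it automatically each time. That is why I need it to detect errors only within a given range.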

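Below is an example of the log file contents. Note that I left some blank lines in it to make my dummy log lines easier for you to spot; just pretend there are no blank lines in between: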


==== WORKING EXAMPLE ====

[20:05:13.968] PULL START
[20:05:18.968] STAT - UPDATE COUNTER TO SERVER
[20:05:19.218] SEND - URL https://someniceurl.commercial
[20:05:19.468] STAT - RESPONSE = OK FOR URL = https://someniceurl.commercial
[20:05:28.609] PULL RESP NONE
[20:05:28.640] Rx - 
[20:05:28.656] STAT - 68$$"MODEM_DOWN"
[20:05:28.671] SEND - TOP UP RESPONSE, TRANS ID = XXXXXXXX, RESP CODE = 68, MESSAGE = MODEM_DOWN
[20:05:28.687] SEND-->topup?trans_id=XXXXXXXX&trans_dateXXXXXXXX&resp_code=68&ussd_msg=M1%24MODEM%5FDOWN&no_sms=1&smscid=
[20:05:28.703] RESPONSE for: topup?trans_id=XXXXXXXX&trans_dateXXXXXXXX&resp_code=68&ussd_msg=M1%24MODEM%5FDOWN&no_sms=1&smscid= --> 
[20:05:28.718] SEND - URL https://someniceurl.commercial
[20:05:28.734] STAT - RESPONSE = OK;XXXXXXXX FOR URL = https://someniceurl.commercial




[20:06:08.953] A VERY VERY LONG CONTENT HERE - +CMS ERROR: 226 <-- Error with different key




[20:05:28.953] PULL START
[20:05:45.968] PULL RESP NONE
[20:05:48.812] STAT - UPDATE COUNTER TO SERVER
[20:05:48.968] SEND - URL https://someniceurl.commercial
[20:05:49.218] PULL START
[20:05:49.468] STAT - RESPONSE = OK FOR URL = https://someniceurl.commercial
[20:05:55.296] PULL RESP NONE
[20:05:58.953] PULL START
[20:06:07.828] PULL RESP NONE
[20:06:08.953] PULL START


[20:06:08.953] A VERY VERY LONG CONTENT HERE - +CMS ERROR: 8 <-- I put this example error output manually, it works





==== NOT WORKING EXAMPLE ====

[20:05:13.968] PULL START
[20:05:18.968] STAT - UPDATE COUNTER TO SERVER
[20:05:19.218] SEND - URL https://someniceurl.commercial
[20:05:19.468] STAT - RESPONSE = OK FOR URL = https://someniceurl.commercial
[20:05:28.609] PULL RESP NONE
[20:05:28.640] Rx - 
[20:05:28.656] STAT - 68$$"MODEM_DOWN"
[20:05:28.671] SEND - TOP UP RESPONSE, TRANS ID = XXXXXXXX, RESP CODE = 68, MESSAGE = MODEM_DOWN
[20:05:28.687] SEND-->topup?trans_id=XXXXXXXX&trans_dateXXXXXXXX&resp_code=68&ussd_msg=M1%24MODEM%5FDOWN&no_sms=1&smscid=
[20:05:28.703] RESPONSE for: topup?trans_id=XXXXXXXX&trans_dateXXXXXXXX&resp_code=68&ussd_msg=M1%24MODEM%5FDOWN&no_sms=1&smscid= --> 
[20:05:28.718] SEND - URL https://someniceurl.commercial
[20:05:28.734] STAT - RESPONSE = OK;XXXXXXXX FOR URL = https://someniceurl.commercial


[20:06:08.953] A VERY VERY LONG CONTENT HERE - +CMS ERROR: 226 <-- But, it starts to detect this one, and if I remove this line it will detect the other above it. It makes my app executing the terminator script. :(


[20:05:28.953] PULL START
[20:05:45.968] PULL RESP NONE
[20:05:48.812] STAT - UPDATE COUNTER TO SERVER
[20:05:48.968] SEND - URL https://someniceurl.commercial
[20:05:49.218] PULL START



[20:06:08.953] A VERY VERY LONG CONTENT HERE - +CMS ERROR: 8 <-- I moved it here, and it does not work anymore. It is good. :)


[20:05:49.468] STAT - RESPONSE = OK FOR URL = https://someniceurl.commercial
[20:05:55.296] PULL RESP NONE
[20:05:58.953] PULL START
[20:06:07.828] PULL RESP NONE
[20:06:08.953] PULL START

2 Answers:

Answer 0 (score: 2)

Use .readlines() to get the file's lines as a list, slice it with [-5:] to get the last 5 lines, and then iterate over those inside should_act():

with open("path/to/logfile.log") as f:
    for line in f.readlines()[-5:]:
        for e in errors:
            if e in line:
                return True

return False

Equivalently:

with open("path/to/logfile.log") as f:
    return any(e in line for line in f.readlines()[-5:] for e in errors)
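
For reference, here is a self-contained sketch of the same idea that can be run as-is. The path and last_n parameters, the ERRORS constant, and the temporary file are assumptions added for the demo; they are not part of the original script.

```python
import os
import tempfile

ERRORS = ['+CMS ERROR: 8',
          '+CMS ERROR: 28',
          '+CMS ERROR: 29',
          '+CMS ERROR: 50',
          '+CMS ERROR: 226']


def should_act(path, errors=ERRORS, last_n=5):
    """Return True if any error string occurs in the last `last_n` lines."""
    with open(path) as f:
        tail = f.readlines()[-last_n:]
    return any(e in line for line in tail for e in errors)


# Demo: the error sits on the second-to-last line of a throwaway log file.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("[00:44:28.484] PULL START\n"
              "[00:44:28.484] +CMS ERROR: 8\n"
              "[00:44:28.484] an empty space / null\n")
    log_path = tmp.name

result = should_act(log_path)
os.remove(log_path)
print(result)
```

Note that readlines() loads the whole file into memory, which is fine for small logs but may matter for very large ones.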

A regular expression also works:

import re

def should_act():
    with open("path/to/logfile.log") as f:
        lines = f.readlines()[-5:]
    # The lines keep their trailing newlines, so a plain join is enough.
    return bool(re.findall(r'\+CMS ERROR: (8|28|29|50|226)', ''.join(lines)))

re.findall returns a list of matches. For this particular regex, it returns a list of the error numbers that matched (8, 28, 29, 50, or 226). Passing that list to bool gives True / False.
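
A quick illustration of what re.findall hands back, using a few made-up tail lines. The trailing \b is an addition that is not in the pattern above: without it, "8" would also match the start of a code such as "80" (the plain substring check has the same quirk).

```python
import re

# \b after the code keeps "8" from matching the start of "80".
PATTERN = re.compile(r"\+CMS ERROR: (8|28|29|50|226)\b")

# Hypothetical last lines of the log, newlines included as readlines() gives them.
tail = [
    "[20:05:58.953] PULL START\n",
    "[20:06:08.953] A VERY VERY LONG CONTENT HERE - +CMS ERROR: 226\n",
    "[20:06:08.953] PULL START\n",
]

codes = PATTERN.findall("".join(tail))
print(codes)        # list of captured error numbers, as strings
print(bool(codes))  # an empty list is falsy, a non-empty one is truthy

other = PATTERN.findall("+CMS ERROR: 80")
print(other)
```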


You can generalize this to check the last N lines, for any N, by slicing with a variable. For example:

threshold = 15
with open("path/to/logfile.log") as f:
    return any(e in line for line in f.readlines()[-threshold:] for e in errors)
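
If the log grows large, one alternative (a different technique from the slicing above, not part of the original answer) is collections.deque with maxlen, which keeps only the last threshold lines while reading. The function name tail_has_error and the temporary file are made up for this sketch.

```python
import os
import tempfile
from collections import deque


def tail_has_error(path, errors, threshold=15):
    """Return True if any error string occurs in the last `threshold` lines."""
    with open(path) as f:
        # A bounded deque discards older lines as newer ones arrive.
        tail = deque(f, maxlen=threshold)
    return any(e in line for line in tail for e in errors)


# Demo: one error followed by 20 ordinary lines.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("+CMS ERROR: 226\n" + "PULL START\n" * 20)
    log_path = tmp.name

errors = ['+CMS ERROR: 8', '+CMS ERROR: 226']
within_15 = tail_has_error(log_path, errors, threshold=15)
within_25 = tail_has_error(log_path, errors, threshold=25)
os.remove(log_path)
print(within_15, within_25)
```

The error is 21 lines from the bottom, so it is ignored with a threshold of 15 and picked up with a threshold of 25.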

Answer 1 (score: 0)

With your code, you basically end up with only the last line of f in line.

Here is your loop, annotated to show what it actually does:

with open("path/to/logfile.log") as f:
    for line in f:  # <-- this for loop does nothing except leave the last line of f in `line`
        pass

return any(error in line for error in errors)
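
One possible modification that uses only plain loops is sketched below. This is an illustration written for this thread, not the answerer's original code; the helper name last_lines_have_error and its parameters are hypothetical. It accepts any iterable of lines, so an open file object can be passed directly.

```python
def last_lines_have_error(lines, errors, n=5):
    """Hypothetical helper: True if any error occurs in the last n lines."""
    tail = []
    for line in lines:
        tail.append(line)
        if len(tail) > n:
            tail.pop(0)  # drop the oldest line so only the last n remain
    for line in tail:
        for error in errors:
            if error in line:
                return True
    return False


errors = ['+CMS ERROR: 8', '+CMS ERROR: 226']

# Error on the second-to-last line: inside the 5-line window.
near = last_lines_have_error(
    ["PULL START\n", "+CMS ERROR: 8\n", "an empty space / null\n"], errors)

# Error followed by 5 ordinary lines: just outside the window.
far = last_lines_have_error(
    ["+CMS ERROR: 8\n"] + ["PULL START\n"] * 5, errors)

print(near, far)
```

With a real log, the call would look like last_lines_have_error(f, errors) inside the with open(...) as f: block.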

I suggest using list comprehensions only once you fully understand them. I have been using Python for more than 3 years and still avoid them.