How to extract the lines between two patterns when the two patterns are identical

Asked: 2018-11-26 20:21:02

Tags: python regex

I am trying to parse some log files in which every line starts with a timestamp, for example:

  

[11/16/18 16:40:04:097 EST]

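If the log contains no errors, every line starts with this same pattern. When an error occurs, however, the whole error stack is printed after a single timestamped line, like this: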

[11/16/18 16:40:04:100 EST] 000000ae CommerceSrvr  E MessagingViewCommandImpl nonHttpForwardDocument(String,String) CMN8014E: The URL constructed during composition using ViewName 
Additional Data: 
    null
Current exception:
Message:
_ERR_BSAFE_FUNCTION
Stack trace:

What I want to do is extract the whole error stack. For example, given this input:

[11/16/18 16:40:04:098 EST] 000000ae CommandLogger 2   PerfLog <entry operation="Command : com.ibm.commerce.messaging.viewcommands.MessagingViewCommandImpl" parameters="@releaseID=9.0 
[11/16/18 16:40:04:100 EST] 000000ae CommerceSrvr  E MessagingViewCommandImpl nonHttpForwardDocument(String,String) CMN8014E: The URL constructed during composition using ViewName 
Additional Data: 
    null
Current exception:
Message:
_ERR_BSAFE_FUNCTION
Stack trace:
[11/16/18 16:40:04:101 EST] 000000ae SystemErr     R   
[11/16/18 16:40:04:102 EST] 000000ae SystemErr     R   com.ibm.commerce.exception.ECSystemException: The URL constructed during composition using ViewName http://localhost:80/webapp/wcs/stores/IBM.WC.Compose/webservices/OAGIS/9.0/BODs/AcknowledgePaymentInstruction.jsp/******** is invalid {1}.
    at com.ibm.commerce.messaging.viewcommands.MessagingViewCommandImpl.nonHttpForwardDocument(MessagingViewCommandImpl.java:581)

The desired output is:

[11/16/18 16:40:04:100 EST] 000000ae CommerceSrvr  E MessagingViewCommandImpl nonHttpForwardDocument(String,String) CMN8014E: The URL constructed during composition using ViewName 
Additional Data: 
    null
Current exception:
Message:
_ERR_BSAFE_FUNCTION
Stack trace: 
[11/16/18 16:40:04:102 EST] 000000ae SystemErr     R   com.ibm.commerce.exception.ECSystemException: The URL constructed during composition using ViewName http://localhost:80/webapp/wcs/stores/IBM.WC.Compose/webservices/OAGIS/9.0/BODs/AcknowledgePaymentInstruction.jsp/******** is invalid {1}.
    at com.ibm.commerce.messaging.viewcommands.MessagingViewCommandImpl.nonHttpForwardDocument(MessagingViewCommandImpl.java:581)

I tried the following and it failed. It would be great if you could tell me what is wrong with my code.

import re, sys

if len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        text = f.read()
else:
    text = sys.stdin.read()

p_start = r'^\[\d{2}/.*'
p_end = r'^\[\d{2}/.*'


pattern = r'{p0}(?!.*{p0})(?:.*?{p1}|.*)'.format(p0=p_start, p1=p_end)

error_no_match = 'No Match'

matches = re.findall(pattern, text, flags=re.M|re.DOTALL)

if matches:
    for match in matches:
        print('match:', match)
    print(len(matches))
else:
    print(error_no_match)

1 Answer:

Answer 0 (score: 2)

Since you read the whole file into the variable text, you can use

matches = re.findall(r'^\[\d{2}/.*(?:\n(?!\[\d{2}/).*)+', text, re.M)

See the regex demo. Note that if your text has CRLF line endings, you may need to replace \n with \r?\n (the CR is optional).
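As a rough illustration of the CRLF note, here is a minimal sketch of the \r?\n variant. The name rx_crlf and the shortened sample text are made up for this example and are not taken from the original log. One thing to keep in mind: in Python, . also matches \r, so the matched blocks still contain CR characters and may need normalising.

```python
import re

# CRLF-tolerant variant of the pattern above: \r?\n accepts LF or CRLF.
rx_crlf = re.compile(r'^\[\d{2}/.*(?:\r?\n(?!\[\d{2}/).*)+', re.M)

# Hypothetical, shortened sample with Windows line endings.
sample = (
    "[11/16/18 16:40:04:098 EST] first entry\r\n"
    "[11/16/18 16:40:04:100 EST] error entry\r\n"
    "Additional Data: \r\n"
    "    null\r\n"
    "[11/16/18 16:40:04:101 EST] next entry"
)

# "." matches "\r", so each match keeps its CR characters;
# convert them to plain LF and drop a trailing CR, if any.
blocks = [m.replace("\r\n", "\n").rstrip("\r") for m in rx_crlf.findall(sample)]

for block in blocks:
    print(block)
```

If the file is opened in text mode in Python 3 with the default newline handling, line endings are already translated to \n on read, so this variant mostly matters when the text comes from a binary read or another source.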

Details

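  • The re.M modifier makes ^ match at the start of every line
  • ^ - the start of a line
  • \[ - a [ character
  • \d{2}/ - two digits and a / character
  • .* - the rest of the line
  • (?:\n(?!\[\d{2}/).*)+ - one or more repetitions of:
    • \n(?!\[\d{2}/) - an LF character (use \r?\n if CRLF endings are possible) that is not followed by [, two digits, and /
    • .* - the rest of the line.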
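The breakdown above can also be written directly into the pattern with re.X (verbose mode), which ignores whitespace and allows comments inside the regex. This is only a sketch that should behave the same as the compact pattern; the name rx_verbose and the shortened sample text are assumptions made for the example.

```python
import re

# Same pattern as above, spelled out with re.X so each part carries its comment.
rx_verbose = re.compile(r"""
    ^\[\d{2}/         # start of a line, then [, two digits and /
    .*                # the rest of the timestamped line
    (?:               # one or more continuation lines:
        \n            #   a line feed ...
        (?!\[\d{2}/)  #   ... not followed by another timestamp prefix
        .*            #   the rest of the continuation line
    )+
""", re.M | re.X)

# Hypothetical, shortened sample in the same shape as the log in the question.
text = (
    "[11/16/18 16:40:04:098 EST] first entry\n"
    "[11/16/18 16:40:04:100 EST] error entry\n"
    "Additional Data: \n"
    "    null\n"
    "[11/16/18 16:40:04:101 EST] next entry"
)

matches = rx_verbose.findall(text)
print(matches)
```

Only the entry that has continuation lines is returned, because the group must repeat at least once.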

Python demo:

import re
rx = r"^\[\d{2}/.*(?:\n(?!\[\d{2}/).*)+"
text = "[11/16/18 16:40:04:098 EST] 000000ae CommandLogger 2   PerfLog <entry operation=\"Command : com.ibm.commerce.messaging.viewcommands.MessagingViewCommandImpl\" parameters=\"@releaseID=9.0 \n[11/16/18 16:40:04:100 EST] 000000ae CommerceSrvr  E MessagingViewCommandImpl nonHttpForwardDocument(String,String) CMN8014E: The URL constructed during composition using ViewName \nAdditional Data: \n    null\nCurrent exception:\nMessage:\n_ERR_BSAFE_FUNCTION\nStack trace:\n[11/16/18 16:40:04:101 EST] 000000ae SystemErr     R   \n[11/16/18 16:40:04:102 EST] 000000ae SystemErr     R   com.ibm.commerce.exception.ECSystemException: The URL constructed during composition using ViewName http://localhost:80/webapp/wcs/stores/IBM.WC.Compose/webservices/OAGIS/9.0/BODs/AcknowledgePaymentInstruction.jsp/******** is invalid {1}.\n    at com.ibm.commerce.messaging.viewcommands.MessagingViewCommandImpl.nonHttpForwardDocument(MessagingViewCommandImpl.java:581)"
matches = re.findall(rx, text, re.M)
print(matches)

Output (reformatted across several lines for readability):

[
  '[11/16/18 16:40:04:100 EST] 000000ae CommerceSrvr  E MessagingViewCommandImpl nonHttpForwardDocument(String,String) CMN8014E: The URL constructed during composition using ViewName \nAdditional Data: \n    null\nCurrent exception:\nMessage:\n_ERR_BSAFE_FUNCTION\nStack trace:', 
  '[11/16/18 16:40:04:102 EST] 000000ae SystemErr     R   com.ibm.commerce.exception.ECSystemException: The URL constructed during composition using ViewName http://localhost:80/webapp/wcs/stores/IBM.WC.Compose/webservices/OAGIS/9.0/BODs/AcknowledgePaymentInstruction.jsp/******** is invalid {1}.\n    at com.ibm.commerce.messaging.viewcommands.MessagingViewCommandImpl.nonHttpForwardDocument(MessagingViewCommandImpl.java:581)'
]
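For large log files it may be preferable not to load everything into memory. The following is a sketch of a different technique, not the regex approach from the answer: it walks the file line by line and groups lines into entries by the same timestamp prefix. The function name multiline_entries and the shortened sample lines are hypothetical.

```python
import re

# Same line prefix the regex above keys on: [, two digits, /.
TIMESTAMP = re.compile(r'\[\d{2}/')

def multiline_entries(lines):
    """Yield every log entry that spans more than one line.

    `lines` can be any iterable of strings, e.g. an open file object.
    """
    entry = None  # None until the first timestamped line is seen
    for raw in lines:
        line = raw.rstrip('\r\n')
        if TIMESTAMP.match(line):
            # A new entry starts; emit the previous one if it had continuation lines.
            if entry is not None and len(entry) > 1:
                yield '\n'.join(entry)
            entry = [line]
        elif entry is not None:
            entry.append(line)
    # Emit the last entry if it had continuation lines.
    if entry is not None and len(entry) > 1:
        yield '\n'.join(entry)

# Hypothetical, shortened sample in the same shape as the log in the question.
sample = [
    '[11/16/18 16:40:04:098 EST] 000000ae CommandLogger 2   PerfLog\n',
    '[11/16/18 16:40:04:100 EST] 000000ae CommerceSrvr  E CMN8014E\n',
    'Additional Data: \n',
    '    null\n',
    'Stack trace:\n',
    '[11/16/18 16:40:04:101 EST] 000000ae SystemErr     R   \n',
    '[11/16/18 16:40:04:102 EST] 000000ae SystemErr     R   ECSystemException\n',
    '    at com.ibm.commerce.Example.method(Example.java:581)\n',
]
entries = list(multiline_entries(sample))

for e in entries:
    print(e)
    print('---')
```

With a real file this could be called as multiline_entries(f) inside a with open(...) block. Like the regex, it returns the error block and the stack trace as two separate entries, because the stack trace starts with its own timestamped line.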