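I have a log like the following:
EVENT: "[INIT]WinEvtLog: Security: AUDIT_SUCCESS(528): Security: Administrator: AMAZON-D071A6F8: AMAZON-D071A6F8: Successful Logon: User Name: Administrator Domain: AMAZON-D071A6F8 Logon ID: (0x0,0x1054A66) Logon Type: 10 Logon Process: User32 Authentication Package: Negotiate Workstation Name: AMAZON-D071A6F8 Logon GUID: - Caller User Name: AMAZON-D071A6F8$ Caller Domain: WORKGROUP Caller Logon ID: (0x0,0x3E7) Caller Process ID: 968 Transited Services: - Source Network Address: 10.0.0.200 Source Port: 60054 [END]";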
I capture the log with this regular expression:
EVENT:\s\"\[INIT\](?P<log>.*?)\[END\]\";
I do this because I want to display the entire EVENT later. Inside (?P<log>) there are a few more things I want to capture. For example:
Source\sPort:\s(?P<src_port>\d+)
Source\sNetwork\sAddress:\s(?P<src_network_addr>\S+)
and other content within the EVENT.
I am not sure how to write a regular expression that captures the entire EVENT as well as the pieces inside it.
Answer 0 (score: 2)
Nest the capture groups inside another capture group:
EVENT:\s\"\[INIT\](?P<log>.*?Source\sNetwork\sAddress:\s(?P<src_network_addr>\S+).*?Source\sPort:\s(?P<src_port>\d+).*?)\[END\]\"
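The regex above captures log, together with the src_network_addr and src_port that appear inside log. Note that it only matches when both fields are present, with the network address before the port.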
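As a minimal sketch of how that pattern behaves in Python, the snippet below applies it to a shortened sample event. The sample strings and the variable names are made up for illustration; only the regex itself comes from the answer.

```python
import re

# Regex from the answer above: two named groups nested inside the log group.
pattern = re.compile(
    r'EVENT:\s\"\[INIT\](?P<log>.*?Source\sNetwork\sAddress:\s'
    r'(?P<src_network_addr>\S+).*?Source\sPort:\s(?P<src_port>\d+).*?)\[END\]\"'
)

# Shortened, hypothetical sample events (not the full log from the question).
full_event = ('EVENT: "[INIT]Logon Type: 10 Source Network Address: 10.0.0.200 '
              'Source Port: 60054 [END]";')
partial_event = 'EVENT: "[INIT]Logon Type: 10 Source Port: 60054 [END]";'

match = pattern.search(full_event)
if match:
    print(match.group('log'))               # the whole entry between [INIT] and [END]
    print(match.group('src_network_addr'))  # the address captured inside log
    print(match.group('src_port'))          # the port captured inside log

# Both fields are mandatory in this pattern, so an event without the
# network address does not match at all.
no_match = pattern.search(partial_event)
print(no_match)
```

Because the inner groups are mandatory here, an event missing either field is skipped entirely, which may or may not be what you want.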
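Answer 1 (score: 1)
The regex listed below matches any event log that starts with EVENT: "[INIT] and ends with [END]";. If any of the phrases of interest appear in the event log, they are captured.
Note the use of nested capture groups: (?P<log>...(?P<src_port>...)...). The outer group captures its entire pattern, including anything captured by the inner groups.
Also note that any group that did not participate in the match still appears in the result dict, with the value None.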
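Both notes can be checked in isolation with a small sketch. The toy pattern and the group names outer and inner are hypothetical and chosen only to keep the example short; they are not part of the log regex.

```python
import re

# Toy pattern (hypothetical, for illustration only): the outer group wraps
# an optional inner group.
pattern = re.compile(r'(?P<outer>a(?P<inner>b)?c)')

with_inner = pattern.match('abc').groupdict()
without_inner = pattern.match('ac').groupdict()

print(with_inner)     # outer spans the whole match, inner holds 'b'
print(without_inner)  # inner did not participate, so its value is None
```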
import re
from pprint import pprint

texts = [
    'EVENT: "[INIT]WinEvtLog: Security: AUDIT_SUCCESS(528): Security: Administrator: AMAZON-D071A6F8: AMAZON-D071A6F8: Successful Logon: User Name: Administrator Domain: AMAZON-D071A6F8 Logon ID: (0x0,0x1054A66) Logon Type: 10 Logon Process: User32 Authentication Package: Negotiate Workstation Name: AMAZON-D071A6F8 Logon GUID: - Caller User Name: AMAZON-D071A6F8$ Caller Domain: WORKGROUP Caller Logon ID: (0x0,0x3E7) Caller Process ID: 968 Transited Services: - Source Network Address: 10.0.0.200 Source Port: 60054 [END]";',
    'EVENT: "[INIT]Random text with one match Source Port: 60054 And stuff at end [END]";',
    'EVENT: "[INIT]Random text with no matches [END]";',
]

for text in texts:
    match = re.match(
        r'''
        EVENT:\s"\[INIT]                      # anchor from beginning
        (?P<log>                              # record entire entry
          (?:                                 # consisting of:
            (?:Source\sNetwork\sAddress:\s    # src_network_address
              (?P<src_network_address>\S+))
            |                                 # OR
            (?:Source\sPort:\s                # src_port
              (?P<src_port>\S+))
            |                                 # OR
            .*?                               # anything else
          )*                                  # as many times as required
        )
        \s\[END]";$                           # anchor at end
        ''',
        text,
        re.VERBOSE)  # verbose mode: whitespace and comments in the pattern are ignored
    if match:
        pprint(match.groupdict())
Result:
{'log': 'WinEvtLog: Security: AUDIT_SUCCESS(528): Security: Administrator: AMAZON-D071A6F8: AMAZON-D071A6F8: Successful Logon: User Name: Administrator Domain: AMAZON-D071A6F8 Logon ID: (0x0,0x1054A66) Logon Type: 10 Logon Process: User32 Authentication Package: Negotiate Workstation Name: AMAZON-D071A6F8 Logon GUID: - Caller User Name: AMAZON-D071A6F8$ Caller Domain: WORKGROUP Caller Logon ID: (0x0,0x3E7) Caller Process ID: 968 Transited Services: - Source Network Address: 10.0.0.200 Source Port: 60054',
'src_network_address': '10.0.0.200',
'src_port': '60054'}
{'log': 'Random text with one match Source Port: 60054 And stuff at end',
'src_network_address': None,
'src_port': '60054'}
{'log': 'Random text with no matches',
'src_network_address': None,
'src_port': None}
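For comparison, here is a sketch of a different technique that neither answer uses: capture the whole log first with the regex from the question, then search inside it for each field separately. The names EVENT_RE, FIELD_RES, and parse_event are hypothetical, and the field patterns are the ones from the question.

```python
import re

# First pass: the regex from the question, which captures the whole entry.
EVENT_RE = re.compile(r'EVENT:\s"\[INIT\](?P<log>.*?)\[END\]";')

# Second pass: one small pattern per field of interest (from the question).
FIELD_RES = {
    'src_network_addr': re.compile(r'Source\sNetwork\sAddress:\s(?P<value>\S+)'),
    'src_port': re.compile(r'Source\sPort:\s(?P<value>\d+)'),
}


def parse_event(text):
    """Return a dict with the log and each field, or None if text is not an event."""
    event = EVENT_RE.search(text)
    if event is None:
        return None
    log = event.group('log')
    fields = {'log': log}
    for name, field_re in FIELD_RES.items():
        found = field_re.search(log)
        # A field that is absent from the log is reported as None.
        fields[name] = found.group('value') if found else None
    return fields


sample = 'EVENT: "[INIT]Random text with one match Source Port: 60054 And stuff at end [END]";'
result = parse_event(sample)
print(result)
```

The trade-off is two passes over the text instead of one, in exchange for field patterns that are independent of each other and of their order in the log.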