Extracting 5 fields from a log file containing strings in Splunk

Time: 2018-08-22 13:27:05

Tags: regex splunk regex-group

Here is the sample log file data:

08/22/2018 02:50:06.380 EDT-0400 2 TCP/IP Controller Plugin.Transmitter pool thread <Regular:2>.CybTargetHandlerChannel.call[:695] - Message has been sent: 20180822 02500636+0400 C7STA PLINUX03 ALOPMTA2.N01834/LO.S00001D182340248/MAIN State EXEC SetStart Status(Executing at PLINUX03) Jobno(34523) ChildPid(34527)  User(PLINUX03) Host(localhost)
08/22/2018 02:50:06.382 EDT-0400 5 TCP/IP Controller Plugin.Transmitter pool thread <Regular:2>.CybTargetHandlerChannelLogHelper.logConnectionClose[:133] - Conversation with C7STA closed
08/22/2018 02:51:21.761 EDT-0400 5 TCP/IP Controller Plugin.Transmitter pool thread <Regular:1>.CybTargetHandlerChannel.call[:666] - Attempting to send:    20180822 02512176+0400 C7STA PLINUX03 ALOECPC7.N01745/LO.S00002D182340242/MAIN State COMPLETE Cmpc(0) SetEnd  User(PLINUX03) Host(localhost)
08/22/2018 02:51:21.771 EDT-0400 2 TCP/IP Controller Plugin.Transmitter pool thread <Regular:1>.CybTargetHandlerChannel.call[:695] - Message has been sent: 20180822 02512176+0400 C7STA PLINUX03 ALOECPC7.N01745/LO.S00002D182340242/MAIN State COMPLETE Cmpc(0) SetEnd  User(PLINUX03) Host(localhost)

I am trying to extract the following five fields from the first and fourth lines, which contain "Message has been sent":

  1. Timestamp: 20180822 02500636+0400, 20180822 02512176+0400
  2. Job name: ALOPMTA2, ALOECPC7
  3. Job number: 01834, 01745
  4. User: User(PLINUX03), User(PLINUX03)
  5. Status: MAIN State EXEC SetStart, MAIN State COMPLETE

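I was able to filter the lines containing "Message has been sent:" with the following expression, but I am not sure how to extract the 5 fields from such a line: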

^.*\b(Message has been sent:.)\b.*$

Can anyone help? The extraction is being done in Splunk. Thanks!

1 answer:

Answer 0: (score: 1)

I suggest you use this regex:

Message has been sent: (?<timestamp>\d{8}\s\d{8}\+\d{4})\s\w+\s\w+\s(?<jobname>\w+)\.N(?<jobnumber>\d+)\/[^\/]+\/(?<statuses>(\w+\s)+)\w+\(.+User\((?<user>\w+)\)
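  • Group "timestamp" (\d{8}\s\d{8}\+\d{4}): matches the timestamp
  • Group "jobname" \s(\w+)\.N: matches the job name
  • Group "jobnumber" \.N(\d+)\/: matches the job number
  • Group "statuses" ((\w+\s)+): matches the statuses
  • Group "user" User\((\w+)\): matches the user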
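
To check the pattern outside Splunk, here is a minimal Python sketch that runs it against the four sample lines from the question. It is only an illustration, not how Splunk applies the regex. Python spells named groups as (?P<name>...) rather than (?<name>...), so the pattern is adapted accordingly. The inner repeated group is made non-capturing and the trailing space is stripped from "statuses"; both are my own tweaks. The names PATTERN, SAMPLE_LINES and extract_fields are hypothetical.

```python
import re

# Same pattern as above, adapted to Python's (?P<name>...) named-group syntax.
PATTERN = re.compile(
    r"Message has been sent: (?P<timestamp>\d{8}\s\d{8}\+\d{4})\s\w+\s\w+\s"
    r"(?P<jobname>\w+)\.N(?P<jobnumber>\d+)/[^/]+/"
    r"(?P<statuses>(?:\w+\s)+)\w+\(.+User\((?P<user>\w+)\)"
)

# The four sample log lines from the question, copied verbatim.
SAMPLE_LINES = [
    "08/22/2018 02:50:06.380 EDT-0400 2 TCP/IP Controller Plugin.Transmitter pool thread <Regular:2>.CybTargetHandlerChannel.call[:695] - Message has been sent: 20180822 02500636+0400 C7STA PLINUX03 ALOPMTA2.N01834/LO.S00001D182340248/MAIN State EXEC SetStart Status(Executing at PLINUX03) Jobno(34523) ChildPid(34527)  User(PLINUX03) Host(localhost)",
    "08/22/2018 02:50:06.382 EDT-0400 5 TCP/IP Controller Plugin.Transmitter pool thread <Regular:2>.CybTargetHandlerChannelLogHelper.logConnectionClose[:133] - Conversation with C7STA closed",
    "08/22/2018 02:51:21.761 EDT-0400 5 TCP/IP Controller Plugin.Transmitter pool thread <Regular:1>.CybTargetHandlerChannel.call[:666] - Attempting to send:    20180822 02512176+0400 C7STA PLINUX03 ALOECPC7.N01745/LO.S00002D182340242/MAIN State COMPLETE Cmpc(0) SetEnd  User(PLINUX03) Host(localhost)",
    "08/22/2018 02:51:21.771 EDT-0400 2 TCP/IP Controller Plugin.Transmitter pool thread <Regular:1>.CybTargetHandlerChannel.call[:695] - Message has been sent: 20180822 02512176+0400 C7STA PLINUX03 ALOECPC7.N01745/LO.S00002D182340242/MAIN State COMPLETE Cmpc(0) SetEnd  User(PLINUX03) Host(localhost)",
]


def extract_fields(line):
    """Return the five named fields as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The statuses group ends with the whitespace that follows the last word.
    fields["statuses"] = fields["statuses"].strip()
    return fields


results = [extract_fields(line) for line in SAMPLE_LINES]
for fields in results:
    if fields is not None:
        print(fields)
```

Only the first and fourth lines produce a result. The second and third do not contain "Message has been sent:", so extract_fields returns None for them.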

You can see an example with the data you provided here: https://regex101.com/r/G6GD46/4

Feel free to adapt this example to get the result you need.

Let me know if you need more information about this regex.

Edit: As @RichG suggested in the comments, I added named groups so that Splunk can extract the groups as variables.