I'm trying to parse custom log files into JSON with nxlog's to_json() so that I can send them to my Elasticsearch instance. I want to split each entry into three separate fields: the date, a log type indicator, and the message.
Here is the format of these logs:
9/10/2015 11:30:05 AM [0-1-1-Pos.xaml.cs-1607] Post button clicked
9/10/2015 11:30:17 AM [0-3-1-SecondaryPortStatus.cs-47] <TRANSACTION>
<FUNCTION_TYPE>SECONDARYPORT</FUNCTION_TYPE>
<COMMAND>STATUS</COMMAND>
<MAC_LABEL>XX</MAC_LABEL>
<MAC>xOel7QeyKoXaddiyrEeWKRI1DlF9sHzUNfZHFI/gAko=</MAC>
<COUNTER>XXX</COUNTER>
</TRANSACTION>
9/10/2015 11:30:17 AM [0-3-1-SecondaryPortStatus.cs-57] <RESPONSE>
<RESPONSE_TEXT>Operation SUCCESSFUL</RESPONSE_TEXT>
<RESULT>OK</RESULT>
<RESULT_CODE>-1</RESULT_CODE>
<TERMINATION_STATUS>SUCCESS</TERMINATION_STATUS>
<COUNTER>221</COUNTER>
<SECONDARY_DATA>12</SECONDARY_DATA>
<MACLABEL_IN_SESSION>P_061</MACLABEL_IN_SESSION>
<SESSION_DURATION>00:00:16</SESSION_DURATION>
<INVOICE_SESSION>XX</INVOICE_SESSION>
<SERIAL_NUMBER>XX</SERIAL_NUMBER>
</RESPONSE>
I have been able to parse the datestamp and the error selector (everything inside the square brackets) using Perl regular expression syntax, as shown below.
1. ^(\d\d|\d)/(\d\d|\d)/(\d\d\d\d)\s(\d\d|\d):(\d\d|\d):(\d\d|\d)\s(AM|PM)
2. \[(.*)\]
But I can't figure out how to pull out everything between the selector and the new line, i.e. up to the start of the next log entry. So in this example I would like my message to be the XML block that follows the selector. Does anyone have a suggestion on how to retrieve that data?
Answer 0 (score: 1)
You should be able to use nxlog's xm_multiline module and specify a regexp in its HeaderLine directive. If you add a capture rule to that regexp to match the XML part (the stuff after [...]), you should then be able to parse the XML with xm_xml's parse_xml().
There is a similar example here.
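Below is a rough, untested sketch of how this could be wired up in nxlog.conf. The file path, the instance names, and the field names $EventTime, $Selector and $Message are made up for illustration; the HeaderLine/Exec regexps and the strptime() format would need checking against your actual data, and the Output/Route blocks that ship the JSON to Elasticsearch are omitted. (On older nxlog versions the multi-line <Exec> block may have to be written as a single Exec directive with line continuations.)

# Sketch only -- adjust paths, regexps and field names to your setup
<Extension multiline>
    Module      xm_multiline
    # A new record starts with a "M/D/YYYY h:m:s AM|PM [..]" header
    HeaderLine  /^\d{1,2}\/\d{1,2}\/\d{4} \d{1,2}:\d{1,2}:\d{1,2} (AM|PM) \[.*\]/
</Extension>

<Extension xml>
    Module      xm_xml
</Extension>

<Extension json>
    Module      xm_json
</Extension>

<Input applog>
    Module      im_file
    File        "/var/log/app.log"
    InputType   multiline
    <Exec>
        # Split each (possibly multi-line) record into date, selector and message
        if $raw_event =~ /(?s)^(\d{1,2}\/\d{1,2}\/\d{4} \d{1,2}:\d{1,2}:\d{1,2} [AP]M) \[([^\]]+)\] (.*)/
        {
            $EventTime = strptime($1, "%m/%d/%Y %I:%M:%S %p");
            $Selector  = $2;
            $Message   = $3;
            # If the message part looks like XML, expand it into fields as well
            if $Message =~ /^</ parse_xml($Message);
            to_json();
        }
    </Exec>
</Input>

The idea is that xm_multiline glues the XML continuation lines onto the header line that precedes them, so the Exec regexp sees the whole record, and to_json() then rewrites $raw_event as a JSON document built from the parsed fields, which is what gets forwarded by whatever output module you route this input to.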
Answer 1 (score: 0)
Try a multiline regex:
$ perl -0777 -ne 'print $& if m!<RESPONSE>.*</RESPONSE>!s' file
The -0777 switch sets the input record separator to undef, slurping the whole file into memory so the regex can match across newlines:
<RESPONSE>
<RESPONSE_TEXT>Operation SUCCESSFUL</RESPONSE_TEXT>
<RESULT>OK</RESULT>
<RESULT_CODE>-1</RESULT_CODE>
<TERMINATION_STATUS>SUCCESS</TERMINATION_STATUS>
<COUNTER>221</COUNTER>
<SECONDARY_DATA>12</SECONDARY_DATA>
<MACLABEL_IN_SESSION>P_061</MACLABEL_IN_SESSION>
<SESSION_DURATION>00:00:16</SESSION_DURATION>
<INVOICE_SESSION>XX</INVOICE_SESSION>
<SERIAL_NUMBER>XX</SERIAL_NUMBER>
</RESPONSE>
The one-liner expands to the equivalent program:

BEGIN { $/ = undef; $\ = undef; }   # input/output separators set to undef
while (defined($_ = <ARGV>)) {
    print $& if m[<RESPONSE>.*</RESPONSE>]s;
}
From perldoc perlre:

s   Treat string as single line. That is, change "." to match any
    character whatsoever, even a newline, which normally it would not
    match.