Parsing multiline logs: log line + XML

Asked: 2016-01-05 15:45:44

Tags: elasticsearch logstash kibana logstash-grok

Let me try to explain this better. I have a multiline log source. The file shown below contains three log entries; each one starts with "INFO" and ends with "</dialogue>":

INFO 05-01-16 08:06:01 [http-nio-8080-exec-8] (AbstractServer.java:454) -
<dialogue>
   <server>FirstLog</server>
   <duration>311</duration>
[...]
</dialogue>
INFO 05-01-16 08:06:02 [http-nio-8080-exec-8] (AbstractServer.java:454) -
<dialogue>
   <server>SecondLog</server>
   <duration>500</duration>
      [...]
</dialogue>
INFO 05-01-16 08:06:03 [http-nio-8080-exec-8] (AbstractServer.java:454) -
<dialogue>
   <server>ThirdLog</server>
   <duration>100</duration>
      [...]
</dialogue> 

I am using this filter:

if [type] == "oldLogs" {
    multiline {
        pattern => "^%{LOGLEVEL}"
        what => "previous"
        negate => "true"
    }
    grok {
        patterns_dir => "./patterns"
        match => ["message", "(?m)%{LOGLEVEL:level} %{TIMESTAMP_ISO8601:timestamp} \[%{PROG:msg_1}\] \(%{JAVAFILE:file}:%{NUMBER:line}\) \-%{GREEDYDATA:msg_3}"]
    }
    xml {
        store_xml => "false"
        source => "msg_3"
        xpath => [
            "/dialogue/server/text()", "server",
            "/dialogue/duration/text()", "duration",
            [...]
        ]
    }
}

With this I can parse both the Java log line and the XML. However, with the filter posted above, Logstash cannot determine where each log entry ends (</dialogue>).

The output is as follows:

"message" =><dialogue>\n<server>FirstLog</server>\n<duration>311</duration>\n[...]\n</dialogue>\nINFO 05-01-16 08:06:02 [http-nio-8080-exec-8] (AbstractServer.java:454) -\n<dialogue>\n<server>SecondLog</server>\n<duration>500</duration>\n[...]\n</dialogue>\nINFO 05-01-16 08:06:03 [http-nio-8080-exec-8] (AbstractServer.java:454) -\n<dialogue>\n<server>ThirdLog</server>\n<duration>100</duration>\n[...]\n</dialogue>
   "level" => "INFO",
   "timestamp" => "05-01-16 08:06:01",
   "msg_1" => "http-nio-8080-exec-8",
   "file" => "AbstractServer.java",
   "xmldata" => <dialogue>\n<server>FirstLog</server>\n<duration>311</duration>\n[...]\n</dialogue>\nINFO 05-01-16 08:06:02 [http-nio-8080-exec-8] (AbstractServer.java:454) -\n<dialogue>\n<server>SecondLog</server>\n<duration>500</duration>\n[...]\n</dialogue>\nINFO 05-01-16 08:06:03 [http-nio-8080-exec-8] (AbstractServer.java:454) -\n<dialogue>\n<server>ThirdLog</server>\n<duration>100</duration>\n[..]\n</dialogue>
   "server" => [ [0] "FirstLog" ],
   "duration" => [ [0] "311" ]

Logstash parses only the first XML log and ignores the other two. My final result should be:

{
   "message" => <dialogue>\n<server>FirstLog</server>\n<duration>311</duration>\n[...]\n</dialogue>
   "level" => "INFO",
   "timestamp" => "05-01-16 08:06:01",
   "msg_1" => "http-nio-8080-exec-8",
   "file" => "AbstractServer.java",
   "xmldata" => <dialogue>\n<server>FirstLog</server>\n<duration>311</duration>\n[...]\n</dialogue>
   "server" => [ [0] "FirstLog" ],
   "duration" => [ [0] "311" ]
}
{
   "message" => <dialogue>\n<server>SecondLog</server>\n<duration>500</duration>\n[...]\n</dialogue>
   "level" => "INFO",
   "timestamp" => "05-01-16 08:06:02",
   "msg_1" => "http-nio-8080-exec-8",
   "file" => "AbstractServer.java",
   "xmldata" => <dialogue>\n<server>SecondLog</server>\n<duration>500</duration>\n[...]\n</dialogue>
   "server" => [ [0] "SecondLog" ],
   "duration" => [ [0] "500" ]
}
{
   "message" => <dialogue>\n<server>ThirdLog</server>\n<duration>100</duration>\n[...]\n</dialogue>
   "level" => "INFO",
   "timestamp" => "05-01-16 08:06:03",
   "msg_1" => "http-nio-8080-exec-8",
   "file" => "AbstractServer.java",
   "xmldata" => <dialogue>\n<server>ThirdLog</server>\n<duration>100</duration>\n[...]\n</dialogue>
   "server" => [ [0] "ThirdLog" ],
   "duration" => [ [0] "100" ]
}

I hope this is clearer now and that someone has time to give me some hints.

Regards

1 Answer:

Answer 0 (score: 0)

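First, you need to join all the lines of one entry into a single multiline event (with the multiline codec or the multiline filter, depending on your needs).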
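
As a minimal sketch of this first step, the joining could be done with the multiline codec on the input rather than with the filter. The file path here is a made-up placeholder, and the use of a file input is an assumption, since the question does not show its input section. The pattern, negate, and what settings are the ones from the question's multiline filter: every line that does not start with a log level is appended to the previous line.

```
input {
  file {
    path => "/path/to/old.log"   # hypothetical path, not from the question
    type => "oldLogs"
    codec => multiline {
      # A line starting with a log level begins a new event;
      # every other line is appended to the previous one.
      pattern => "^%{LOGLEVEL}"
      negate => true
      what => "previous"
    }
  }
}
```

Joining at the input keeps lines from the same file together before any filter sees them, which is usually the safer place to do it.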

Then, parse out the log level, date, and so on, and put the XML in its own field.
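
A sketch of this second step, reusing the grok pattern from the question unchanged except for the name of the last capture. The field name xmldata is taken from the question's sample output; it is an assumption that this is the field the XML should end up in.

```
filter {
  if [type] == "oldLogs" {
    grok {
      patterns_dir => "./patterns"
      # (?m) lets GREEDYDATA run across the newlines of the joined event,
      # so everything after the "-" lands in the xmldata field.
      match => ["message", "(?m)%{LOGLEVEL:level} %{TIMESTAMP_ISO8601:timestamp} \[%{PROG:msg_1}\] \(%{JAVAFILE:file}:%{NUMBER:line}\) \-%{GREEDYDATA:xmldata}"]
    }
  }
}
```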

Finally, use the xml {} filter on the new XML field.
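
A sketch of this last step, assuming the XML was captured into a field named xmldata (the name that appears in the question's sample output). Only the two XPath expressions shown in the question are included; the entries elided there are left out.

```
filter {
  if [type] == "oldLogs" {
    xml {
      source => "xmldata"    # field holding the XML captured by grok
      store_xml => false     # keep only the xpath results, not the whole parsed tree
      xpath => [
        "/dialogue/server/text()", "server",
        "/dialogue/duration/text()", "duration"
      ]
    }
  }
}
```

If each event contains exactly one <dialogue> element, server and duration should each come out as a one-element array, as in the expected result in the question.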