Correct multiline regex for ELK?

Asked: 2017-08-23 10:34:26

Tags: regex logstash multiline logstash-grok logstash-configuration

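I am new to ELK and I am writing a configuration file that uses the multiline codec. I need a pattern for the following input data, in which each record spans several lines: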

110000|read|<soapenv:Envelope>
<head>hello<head>
<body></body>
</soapenv:Envelope>|<soapenv:Envelope>
<body></body>
</soapenv:Envelope>
210000|read|<soapenv:Envelope>
<head>hello<head>
<body></body>
</soapenv:Envelope>|<soapenv:Envelope>
<body></body>
</soapenv:Envelope>
370000|read|<soapenv:Envelope>
<head>hello<head>
<body></body>
</soapenv:Envelope>|<soapenv:Envelope>
<body></body>
</soapenv:Envelope>

The configuration file I am using is:

input {
  file {
    path => "/opt/test5/practice_new/xml_input.dat"
    start_position => "beginning"
    codec => multiline {
      pattern => "^%{INT}\|%{WORD}\|<soapenv:Envelope*>\|<soapenv"
      negate => true
      what => "previous"
    }
  }
}
filter {
  grok {
    match => [ "message", "%{DATA:method_id}\|%{WORD:method_type}\|%{GREEDYDATA:request}\|%{GREEDYDATA:response}" ]
  }
}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "xml"
  }
  stdout {}
}

However, the pattern used in it does not do what I need.

Please suggest the correct pattern.

Expected output:

First log entry

method_id- 110000

method type-

request-

response-

Second log entry

method_id- 210000

method type-

request-

response-

And likewise for the remaining entries.

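1 answer:

Answer 0 (score: 0)

First, you have to fix the multiline pattern. Your pattern expects `|<soapenv` on the same line as the record header, but the first line of each record ends right after `<soapenv:Envelope>`, so it never matches. Match only the start of a record instead; with negate => true and what => previous, every line that does not match is appended to the preceding event: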

codec => multiline {
  pattern => "^%{NUMBER:method_id}\|%{DATA:method_type}\|<soapenv:Envelope>"
  negate => true
  what => previous
}
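To see what this grouping does outside Logstash, here is a minimal Python sketch of the negate => true / what => previous behaviour. It is only a simulation, not the codec itself: `group_events` is a hypothetical helper, and `\d+`, `\w+` and `[^|]*?` are assumed to be close enough stand-ins for the grok patterns INT/NUMBER, WORD and DATA on this particular data.

```python
import re

# Approximation of the question's pattern (INT -> \d+, WORD -> \w+; an assumption).
ORIGINAL = re.compile(r"^\d+\|\w+\|<soapenv:Envelope*>\|<soapenv")
# Approximation of the fixed pattern (NUMBER -> \d+, DATA -> [^|]*?; an assumption).
FIXED = re.compile(r"^\d+\|[^|]*?\|<soapenv:Envelope>")

RECORD_BODY = [
    "<head>hello<head>",
    "<body></body>",
    "</soapenv:Envelope>|<soapenv:Envelope>",
    "<body></body>",
    "</soapenv:Envelope>",
]
# Rebuild the three sample records from the question, line by line.
LINES = []
for method_id in ("110000", "210000", "370000"):
    LINES.append(f"{method_id}|read|<soapenv:Envelope>")
    LINES.extend(RECORD_BODY)


def group_events(lines, pattern):
    """Simulate negate => true, what => previous.

    A line matching the pattern starts a new event; any other line is
    appended to the previous event.
    """
    events = []
    for line in lines:
        if pattern.search(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


fixed_events = group_events(LINES, FIXED)
original_events = group_events(LINES, ORIGINAL)
print(len(fixed_events), "events with the fixed pattern")
print(len(original_events), "event(s) with the original pattern")
```

With the fixed pattern each record becomes its own event, while the original pattern never matches a line, so in this simulation everything is merged into a single event.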

After that, you can use the pattern Wiktor suggested in the comments:

(?m)^(?<method_id>\d+)\|(?<method_type>\w+)\|(?<request><soapenv:Envelope>.*?</soapenv:Envelope>)\|(?<response><soapenv:Envelope>.*?</soapenv:Envelope>)
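If you want to check the pattern without a running Logstash, the following Python sketch applies an equivalent expression to one assembled event. It is an illustration under assumptions: Python writes named groups as `(?P<name>...)` rather than `(?<name>...)`, and `re.DOTALL` is used in place of the leading `(?m)`, which in the Ruby/Oniguruma engine used by grok makes `.` match newlines. `parse_event` is a hypothetical helper name.

```python
import re

# Python spelling of the pattern: (?P<name>...) groups, re.DOTALL instead of (?m).
RECORD = re.compile(
    r"^(?P<method_id>\d+)\|(?P<method_type>\w+)\|"
    r"(?P<request><soapenv:Envelope>.*?</soapenv:Envelope>)\|"
    r"(?P<response><soapenv:Envelope>.*?</soapenv:Envelope>)",
    re.DOTALL,
)

# One event as the multiline codec would hand it to grok: lines joined by "\n".
EVENT = (
    "110000|read|<soapenv:Envelope>\n"
    "<head>hello<head>\n"
    "<body></body>\n"
    "</soapenv:Envelope>|<soapenv:Envelope>\n"
    "<body></body>\n"
    "</soapenv:Envelope>"
)


def parse_event(event):
    """Return the named groups as a dict, or None if the event does not match."""
    match = RECORD.match(event)
    return match.groupdict() if match else None


fields = parse_event(EVENT)
for name, value in fields.items():
    print(f"{name}: {value!r}")
```

The lazy `.*?` matters here: it stops the request at the first `</soapenv:Envelope>` that is followed by `|`, so the response is captured separately instead of being swallowed by the request.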

The results for the three log entries from your post can be checked on http://grokconstructor.appspot.com.

Your whole configuration could look like this:

input {
  file {
    path => "/opt/test5/practice_new/xml_input.dat"
    start_position => "beginning"
    # A line matching the pattern starts a new record; every other line
    # (negate => true) is appended to the previous event.
    codec => multiline {
      pattern => "^%{NUMBER:method_id}\|%{DATA:method_type}\|<soapenv:Envelope>"
      negate => true
      what => previous
    }
  }
}
filter {
  grok {
    # (?m) lets "." match newlines, so request and response can span several lines.
    match => [ "message", "(?m)^(?<method_id>\d+)\|(?<method_type>\w+)\|(?<request><soapenv:Envelope>.*?</soapenv:Envelope>)\|(?<response><soapenv:Envelope>.*?</soapenv:Envelope>)" ]
  }
}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "xml"
  }
  stdout {}
}