Question

Newbie alert!

I have the following grok filter:

filter {

        grok {
                match   =>  [ "message","%{DATESTAMP:timestamp}" ]
                match   =>  [ "message", "(?<number_after_timestamp>[0-9]{8}\s\w+)"]
                match   =>  [ "message", "(?<error_or_debug>(ERROR|DEBUG))"]
                match   =>  [ "message", "(?<first_part>ORB\.thread\.pool.*(?=\s{2}))" ]
                match   =>  [ "message", "(?<exception_class_name>(?<=\<Exception class name\=\s).*?\>)" ]
                match   =>  [ "message", "(?<exception_message>(?<=\<Exception message\=).*?(?=\>))" ]

        }   


}

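When tested individually in the grok debugger, each pattern matches exactly the chunk of text I need, and the named captures come out as fields just fine. However, when I run Logstash on the same log events I used in the grok debugger, neither the log line nor any of the named fields is emitted.

For example, the exception class name pattern yields: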

{
    "exception_class_name": [
    [
      "com.ultatica.bd.exceptions.TTFException>"
    ]
   ]
}

But when run against the data from the logstash command line -> not a sausage!

Any help is much appreciated.

Thanks

The log file looks like this:

[30/09/14 23:07:15:195 BST] 00000043 SystemOut O ERROR 32109 Tue Sep 30 23:07:15 BST 2014 ORB.thread.pool : 2 webuser com.ultra.bd.services.UltraticoCustomerService.processRequest API getPerson  <Exception class name= com.Ultratico.bd.exceptions.UCOException> <Exception message= e05CX432182S> <UCOException Error = 32109>

[30/09/14 23:07:15:200 BST] 00000043 SystemOut O ERROR 32109 Tue Sep 30 23:07:15 BST 2014 ORB.thread.pool : 2 webuser com.Ultratico.ecrm.framework.sessionHandler.UltraticoSessionHandler.execute  <Exception class name= com.Ultratico.bd.exceptions.UCOException> <Exception message= e05CX432182S> <UCOException Error = 32109>

Answer 1

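Looking at the input, the way I would try to parse it is with a single match expression. In logstash, I use multiple match patterns as a way of parsing different types of log entries, e.g.

match pattern 1 => NO
match pattern 2 => YES
match pattern 3 => NO

So for your example, I would do the following: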

filter {
    grok {      
        break_on_match => false
        match => [ "message", "%{SYSLOG5424SD:timestamp} %{NUMBER:number_after_timestamp} (?<forget1>.*) (?<error_or_debug>ERROR|DEBUG) %{NUMBER:process_id} (?<timestamp_2>.{7} \d{2} \d{2}:\d{2}:\d{2} \w{3} \d{4}) %{JAVACLASS:origin} : (?<first_part>.*) %{JAVACLASS:exception_class_name} (?<exception_message>.*)" ]
        match => [ "message", "..some other pattern you want to extract.." ]
        match => [ "message", "..some other pattern you want to extract.." ]            
    }
}

It could be tidied up, but you get the gist.

Answer 2

The syntax for grok is match => [ "field", "pattern1", "pattern2", "pattern3", ..., "patternN" ]. Multiple match arguments don't work, because they are loaded into a hash, which means only the last one is used.
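
As a rough sketch of that single-match form: the grok documentation also shows it written as a hash whose value is an array of patterns, which is the variant used here. The two patterns are copied from the question. Combining them this way, and setting break_on_match => false so that grok tries every pattern instead of stopping at the first hit, is an assumption about what the asker wants and has not been tested against their Logstash version.

```
filter {
    grok {
        # Try every pattern in the list, not just the first one that matches.
        break_on_match => false
        # One match option, one field, several patterns.
        match => {
            "message" => [
                "(?<error_or_debug>(ERROR|DEBUG))",
                "(?<exception_message>(?<=\<Exception message\=).*?(?=\>))"
            ]
        }
    }
}
```

Because there is only one match option, nothing gets overwritten when the configuration is loaded.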

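To do it the way you are doing it, you would need to create multiple grok blocks, each with break_on_match => false. It would be better, though, to use the first form and match the entire line with one complete pattern, so that you avoid the inevitable _grokparsefailure tags.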
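
A minimal sketch of the multiple-block variant, assuming only two of the patterns from the question are needed. Each grok block carries exactly one match, so none of them is overwritten. Any block whose pattern does not match will tag the event with _grokparsefailure, which is the drawback mentioned above.

```
filter {
    # First block: pull out the timestamp.
    grok {
        break_on_match => false
        match => [ "message", "%{DATESTAMP:timestamp}" ]
    }
    # Second block: pull out the log level.
    grok {
        break_on_match => false
        match => [ "message", "(?<error_or_debug>(ERROR|DEBUG))" ]
    }
}
```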

Logstash grok custom patterns not producing any fields
