Combining several grok filters into one

Time: 2017-05-14 09:24:39

Tags: logstash logstash-grok

I have log entries and grok patterns as follows.

Logs:

2017-04-11 18:31:41,938 | INFO | 195 | Process | Bundle Name | logStr: GUID: dl99X/WeN77E2SmyjH9uS1Fy+EDvFQ5R_939bae | ReferenceID: 20170411183141500676 | InstanceID: 70411183141906430422429270016 | ChannelID: EXAMPLE | System: EXAMPLE | ServiceName: EXAMPLE | InvocationPoint: inbound

2017-04-11 18:31:42,743 | INFO | 193 | API | Bundle Name | Outbound Message | RESPONSE=[GUID=[dl99X/WeN77E2SmyjH9uS1Fy+EDvFQ5R_939bae], InstanceID=[70411183141906430422429270016], logStr=[GUID: dl99X/WeN77E2SmyjH9uS1Fy+EDvFQ5R_939bae | ReferenceID: 20170411183141500676 | InstanceID: 70411183141906430422429270016 | ChannelID: EXAMPLE | System: EXAMPLE | ServiceName: EXAMPLE | InvocationPoint: inbound

Grok patterns:

    grok {
        #grok general pattern
        match => {
            "message" => "%{TIMESTAMP_ISO8601:logdate}%{SPACE}\|%{SPACE}%{LOGLEVEL:level}%{SPACE}\|%{SPACE}%{DATA:thread}%{SPACE}\|%{SPACE}%{DATA:serviceName}%{SPACE}\|%{SPACE}%{DATA:bundle}%{SPACE}\|%{SPACE}%{GREEDYDATA:logdetails}"
        }
    }
    #Grok to get GUID
    grok {
        match => {
            "logdetails" => "(?<=GUID:).%{DATA:guid}(?=\s)"
        }
    }
    #Grok to get ChannelID
    grok {
        match => {
            "logdetails" => "(?<=ChannelID:).%{DATA:channelID}(?=\s)"
        }
    }
    #Grok to get ReferenceID
    grok {
        match => {
            "logdetails" => "(?<=ReferenceID:).%{DATA:referenceID}(?=\s)"
        }
    }

I have several separate grok filters to extract GUID, ChannelID and ReferenceID. Is there a way to combine them into one?

Thanks in advance!

1 answer:

Answer 0: (score: 0)

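It would help to know exactly which kinds of logs you are dealing with, but when there are too many types to worry about (and they all share the same format), this is what I do:

  1. Determine the base format.
  2. Treat everything after the base as the "msg", or payload.
  3. Parse the payload for the fields you are looking for.
  4. The base format of each of your messages is timestamp | loglevel | thread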

    LINE %{BASE}\s?\|\s?%{GREEDYDATA:msg}
    
    # Patterns
    BASE %{CUSTTIME:timestamp}\s?\|\s?%{WORD:loglevel}\s?\|\s?%{NONNEGINT:thread}
    CUSTTIME %{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND},%{MILLI}
    MILLI (?:[0-9]{1,3})
    

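    You can then add patterns for the fields you are looking for to the same patterns file. The values all look a lot like key=value pairs, except that they use a colon: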

    COMMAVALUE (\s?(.*?(?=\s\w+:|$))\s?)
    
    # Fields
    GUID GUID:%{COMMAVALUE:guid}
    CHANNELID ChannelID:%{COMMAVALUE:channel_id}
    REFERENCEID ReferenceID:%{COMMAVALUE:reference_id}
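
One thing to watch with COMMAVALUE as written: the lazy match only stops at the next ` key:`, so a value followed by ` | ` would likely be captured together with that separator and the surrounding spaces. If the values never contain whitespace, which holds for the sample lines but is an assumption about the rest of the logs, a tighter variant of the field patterns could use the stock NOTSPACE pattern instead. A sketch of that alternative:

```
# Fields (alternative sketch; assumes values contain no whitespace)
GUID GUID:\s*%{NOTSPACE:guid}
CHANNELID ChannelID:\s*%{NOTSPACE:channel_id}
REFERENCEID ReferenceID:\s*%{NOTSPACE:reference_id}
```

These lines would replace the three field definitions, not sit alongside them. The pattern names and capture names are unchanged, so the filter configuration that references them stays the same.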
    

    So you can do this with two adjacent grok filters: one to extract the msg payload, and another to extract the fields from that payload.

    filter {
        grok {
            patterns_dir   => "/etc/logstash/patterns"
            match => { "message" => "%{LINE}" }
        }
        grok {
            patterns_dir => "/etc/logstash/patterns"
            break_on_match => false
            match => [
                "msg", "%{GUID}",
                "msg", "%{CHANNELID}",
                "msg", "%{REFERENCEID}"
            ]
        }
    }
    
    output {
        stdout { codec => "rubydebug" }
    }
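
For the configuration in the question, where the payload is already stored in `logdetails`, the same idea can probably be expressed without a separate patterns file. The sketch below keeps the question's field names (`guid`, `channelID`, `referenceID`) and assumes the values contain no whitespace, so the stock NOTSPACE pattern is enough. It lists all three patterns under one `match` and sets `break_on_match => false` so that grok tries every pattern instead of stopping at the first one that matches:

```
filter {
    grok {
        # Try every pattern in the list, not just the first that matches
        break_on_match => false
        match => {
            "logdetails" => [
                "GUID:\s*%{NOTSPACE:guid}",
                "ChannelID:\s*%{NOTSPACE:channelID}",
                "ReferenceID:\s*%{NOTSPACE:referenceID}"
            ]
        }
    }
}
```

This would take the place of the three separate grok blocks that follow the general pattern in the question. Because each pattern is matched independently, the order of the keys within the log line does not matter.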