How can I use GROK to parse the different types of messages in a sample log file?

Time: 2019-06-26 13:06:03

Tags: design-patterns logstash elastic-stack logstash-grok

I am using Logstash, and I need to parse my log file so that each field is split out.

The logs have the following format:

info: ::ffff:127.0.0.1 - ::ffff:127.0.0.1 [26/Jun/2019:11:52:36 +0000] "OPTIONS /api/categories/5ced18e2a0c9a01e879ce704 HTTP/1.1" 200 19 - 0.652

info: ::ffff:127.0.0.1 - - [26/Jun/2019:11:52:36 +0000] "GET /api/categories/5ced18e2a0c9a01e879ce704 HTTP/1.1" 304 - - 12.156

info: ::ffff:127.0.0.1 ::ffff:127.0.0.1 - [26/Jun/2019:11:52:36 +0000] "OPTIONS /api/twitter/5ced18e2a0c9a01e879ce704/1561463556535-1561549956535?from=0&size=10&orderType=desc&order=date&aggregations=true&timeSeriesInterval=1h HTTP/1.1" 200 8 - 0.874

error: ::ffff:127.0.0.1 ::ffff:127.0.0.1 ::ffff:127.0.0.1 [26/Jun/2019:11:52:36 +0000] "GET /api/twitter/5ced18e2a0c9a01e879ce704/1561463556535-1561549956535?from=0&size=10&orderType=desc&order=date&aggregations=true&timeSeriesInterval=1h HTTP/1.1" 400 43 - 9.044

This is the filter I have configured in Logstash:

filter {
    grok {
      match => { "message" => "%{WORD:type}: %{IP:ipclient} - %{IP:ipuser} [%{HTTPDATE:datetime}] \"%{WORD:method} %{URIPATHPARAM:request} %{WORD:httpversion}\" %{WORD:status} %{NUMBER:bytes} - %{NUMBER:responsetime}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    date {
      match => [ "timestamp" , "dd/MM/yyyy - HH:mm:ss" ]
    }
  }

However, this filter does not recognize my log structure.

Do you know which pattern would solve this?

Thanks.

1 answer:

Answer 0: (score: 0)

Try the following pattern:

%{WORD:type}: %{IP:ipclient} - %{IP:ipuser} \[%{HTTPDATE:datetime}\] \"%{WORD:method} %{URIPATHPARAM:request} %{WORD}\/%{NUMBER:httpversion}\" %{WORD:status} %{NUMBER:bytes} - %{NUMBER:responsetime}

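  1. Escape the square brackets [ and ]. Unescaped, they are read as a regex character class rather than as literal characters.
  2. If I'm not mistaken, / is not matched by WORD, so %{WORD:httpversion} cannot match HTTP/1.1. I also capture only the version number and drop the HTTP/ part.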
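For context, here is a minimal sketch of how this pattern might sit in the full filter from the question. The date block is my own adjustment, not part of the pattern above: the question's date filter reads a timestamp field that the grok never creates, so this sketch assumes you want to parse the datetime capture instead, using the format string that corresponds to HTTPDATE. Treat it as a starting point and check it against your own events.

```logstash
filter {
    grok {
      # Pattern from this answer: brackets escaped, HTTP version split on "/"
      match => { "message" => "%{WORD:type}: %{IP:ipclient} - %{IP:ipuser} \[%{HTTPDATE:datetime}\] \"%{WORD:method} %{URIPATHPARAM:request} %{WORD}\/%{NUMBER:httpversion}\" %{WORD:status} %{NUMBER:bytes} - %{NUMBER:responsetime}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    date {
      # Assumption: parse the "datetime" field captured above,
      # e.g. 26/Jun/2019:11:52:36 +0000
      match => [ "datetime", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
}
```

Events that do not match the grok pattern are tagged _grokparsefailure by default, which is a quick way to spot lines the pattern does not yet cover.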

In general, the Grok Debugger in Kibana will be your best debugging companion.
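One thing worth checking there: the sample lines in the question do not all have the same shape. Any of the three address positions can hold - instead of an IP, and the byte count is - on the 304 response, so a pattern that requires "IP - IP" and a numeric byte count only matches the first sample line. Below is a hedged sketch of a more tolerant variant. It is my assumption about what the fields mean, not something the log format confirms, and the field name ipident is a placeholder I made up for the middle position.

```logstash
filter {
    grok {
      # Each address position accepts either an IP or a literal "-".
      # "ipident" is a hypothetical field name for the middle position.
      # The byte count may also be "-" (as on the 304 line).
      match => { "message" => "%{WORD:type}: (?:%{IP:ipclient}|-) (?:%{IP:ipident}|-) (?:%{IP:ipuser}|-) \[%{HTTPDATE:datetime}\] \"%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:status} (?:%{NUMBER:bytes}|-) - %{NUMBER:responsetime}" }
    }
}
```

With this variant, a position that holds - simply produces no field for that event, so downstream code should not assume ipident, ipuser, or bytes is always present.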