Custom GROK filter - Logstash -> Elasticsearch

Time: 2018-01-30 12:59:33

Tags: regex elasticsearch logstash logstash-grok

I have a log that is captured and sent to Logstash. The format of each log line is:

22304999    5   400.OUTPUT_SERVICE.510  submit  The limit has been exceeded. Please use a different option. 2.54.44.221 /api/output/v3/contract/:PCID/order /api/output/v3/contract/:pcid/order https://www.example.org/output/ PUT 400 2017-09-28T15:50:57.843176Z

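I am trying to create a custom grok filter that adds field names (headers) to the values before they are sent to Elasticsearch.

What I am aiming for is something like this: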

 SessionID   => "22304999"
 HitNumber   => "5"
 FactValue   => "400.OUTPUT_SERVICE.510"
 DimValue1   => "submit"
 ErrMessage  => "The limit has been exceeded. Please use a different option."
 IP          => "2.54.44.221"
 TLT_URL     => "/api/output/v3/contract/:PCID/order"
 URL         => "/api/output/v3/contract/:pcid/order"
 Refferer    => "https://www.example.org/output/"
 Method      => "PUT"
 StatsCode   => "400"
 ReqTime     => "2017-09-28T15:50:57.843176Z"

I am new to this, so I am just trying to understand how to apply and test it. For example, I would start with an empty filter:

filter {
   grok {
     match => { "message" => "" }
   }
 }

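My first question is about match => { "message" => "" }: is message just one log line? What defines 'message'?

My log, and the fields I want, are tab-separated: after each tab a new field starts. Does that make what I want to achieve easier? Instead of looking for patterns, can I just look for the next tab?

Failing that, could someone provide an example for one of my fields? I should be able to work out the rest.

2 answers:

Answer 0: (score: 2)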

Regex:

(?<SessionID>\S+)\s+(?<HitNumber>\S+)\s+(?<FactValue>\S+)\s+(?<DimValue1>\S+)\s+(?<ErrMessage>.+)\s+(?<IP>(?:\d{1,3}\.){3}\d{1,3})\s+(?<TLT_URL>\S+)\s+(?<URL>\S+)\s+(?<Refferer>\S+)\s+(?<Method>\S+)\s+(?<StatsCode>\S+)\s+(?<ReqTime>\S+)

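Details:

  • (?<name>) is a named capture group
  • \S matches any non-whitespace character
  • \s matches any whitespace character, including a tab
  • \d matches a digit
  • {n,m} matches between n and m times
  • + matches between one and unlimited times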

Output:

{
  "SessionID": [
    [
      "22304999"
    ]
  ],
  "HitNumber": [
    [
      "5"
    ]
  ],
  "FactValue": [
    [
      "400.OUTPUT_SERVICE.510"
    ]
  ],
  "DimValue1": [
    [
      "submit"
    ]
  ],
  "ErrMessage": [
    [
      "The limit has been exceeded. Please use a different option."
    ]
  ],
  "IP": [
    [
      "2.54.44.221"
    ]
  ],
  "TLT_URL": [
    [
      "/api/output/v3/contract/:PCID/order"
    ]
  ],
  "URL": [
    [
      "/api/output/v3/contract/:pcid/order"
    ]
  ],
  "Refferer": [
    [
      "https://www.example.org/output/"
    ]
  ],
  "Method": [
    [
      "PUT"
    ]
  ],
  "StatsCode": [
    [
      "400"
    ]
  ],
  "ReqTime": [
    [
      "2017-09-28T15:50:57.843176Z"
    ]
  ]
}
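
To connect this to the empty filter in the question, here is a minimal sketch of how the regex above could be dropped into a grok block. It assumes the raw log line arrives in the event's `message` field, which is where Logstash inputs such as stdin or file place each line by default, and that escape-sequence processing in config strings is left at its default (off), so the backslashes reach the regex engine unchanged.

```
filter {
  grok {
    # "message" is the event field that holds the raw log line.
    # Each (?<Name>...) group becomes a field called Name on the event.
    match => {
      "message" => "(?<SessionID>\S+)\s+(?<HitNumber>\S+)\s+(?<FactValue>\S+)\s+(?<DimValue1>\S+)\s+(?<ErrMessage>.+)\s+(?<IP>(?:\d{1,3}\.){3}\d{1,3})\s+(?<TLT_URL>\S+)\s+(?<URL>\S+)\s+(?<Refferer>\S+)\s+(?<Method>\S+)\s+(?<StatsCode>\S+)\s+(?<ReqTime>\S+)"
    }
  }
}
```

If a line does not match, grok leaves the event as it is and adds a `_grokparsefailure` tag, which is a convenient thing to look for while testing.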

Answer 1: (score: 2)

If you want to test a solution, you can always use this site:

http://grokconstructor.appspot.com/do/match

I put together this grok pattern for your problem:

%{INT:SessionID}\s+%{INT:HitNumber}\s+%{NOTSPACE:FactValue}\s+%{NOTSPACE:DimValue1}\s+%{GREEDYDATA:ErrMessage}\s+%{IP:IP}\s+%{NOTSPACE:TLT_URL}\s+%{NOTSPACE:URL}\s+%{NOTSPACE:Referer}\s+%{NOTSPACE:Method}\s+%{INT:StatsCode}\s+%{TIMESTAMP_ISO8601:ReqTime}
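
For testing locally rather than on the site, a small throwaway pipeline is one option. The sketch below is only an illustration: it reads lines from stdin, applies a grok pattern of the kind shown above, and prints the parsed event. The pattern here includes a `%{NOTSPACE:DimValue1}` capture so that `submit` gets its own field, and it separates fields with `\s+` so that, assuming a single tab between fields, `ErrMessage` does not keep the trailing separator. The file name `grok-test.conf` is hypothetical.

```
# grok-test.conf (hypothetical file name)
input {
  # Each line typed or pasted on stdin becomes one event;
  # the raw text of the line is stored in the "message" field.
  stdin { }
}

filter {
  grok {
    match => {
      "message" => "%{INT:SessionID}\s+%{INT:HitNumber}\s+%{NOTSPACE:FactValue}\s+%{NOTSPACE:DimValue1}\s+%{GREEDYDATA:ErrMessage}\s+%{IP:IP}\s+%{NOTSPACE:TLT_URL}\s+%{NOTSPACE:URL}\s+%{NOTSPACE:Referer}\s+%{NOTSPACE:Method}\s+%{INT:StatsCode}\s+%{TIMESTAMP_ISO8601:ReqTime}"
    }
  }
}

output {
  # Pretty-print every event, including the fields grok extracted.
  stdout { codec => rubydebug }
}
```

Running `bin/logstash -f grok-test.conf` and pasting a sample log line should print the event with one field per capture. Once the fields look right, the stdin and stdout sections can be swapped for the real input and an elasticsearch output.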