Logstash: converting URL parameters into a hash

Date: 2016-10-06 14:35:41

Tags: url filter logstash

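I am using Logstash and Elasticsearch to monitor activity on my Apache web server. So far it works well, but I need more specific information about the request field. My current Logstash configuration is: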

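filter {
  # Parse the standard combined Apache access-log format.
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
  # Split the request into its path and its (optional) query string.
  grok { match => { "request" => [ "url", "%{URIPATH:url_path}%{URIPARAM:url_params}?" ]} }
  urldecode { field => "url_path" }
  # Remove "?" characters (URIPARAM captures the leading one).
  mutate { gsub => [ "url_params", "\?", "" ] }
  # Emit one flat url_param_<name> field per query parameter.
  kv {
    field_split => "&"
    source => "url_params"
    prefix => "url_param_"
  }
  date { match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ] }
  geoip { source => "clientip" }
  useragent { source => "agent" }
}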

Given a basic Apache log line:

255.254.230.10 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345 HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"

The result with this first configuration is:

{
         "message" => "255.254.230.10 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345 HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
        "@version" => "1",
      "@timestamp" => "2013-12-11T08:01:45.000Z",
            ...
         "request" => "/xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345",
        "url_path" => "/xampp/boreal:123456/status.php",
      "url_params" => "pretty=true&test=boreal%3A12345",
"url_param_pretty" => "true",
  "url_param_test" => "boreal%3A12345",
           ...    
}

And (in my dreams) I would like the URL parameters to come out like this:

{
         ...
         "request" => "/xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345",
        "url_path" => "/xampp/boreal:123456/status.php",
      "url_params" => {
                "pretty" => "true",
                  "test" => "boreal:12345"
      },
           ...    
}

My wishes:

  • url_params becomes a hash (associative array).
  • Each key of this hash is the name of a parameter.
  • Each corresponding value is the URL-decoded value.
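
To make the target shape concrete, here is a minimal Python sketch of the transformation described by the three wishes above. It is not Logstash code and only illustrates the desired result; the function name split_request is hypothetical, and the sample request is the one from the log line in the question.

```python
from urllib.parse import parse_qsl, unquote, urlsplit


def split_request(request):
    """Return the decoded path and a dict of decoded query parameters.

    Hypothetical helper, used only to illustrate the desired output shape.
    """
    parts = urlsplit(request)
    # parse_qsl splits on "&" and "=" first, then percent-decodes each piece.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    return {"url_path": unquote(parts.path), "url_params": params}


result = split_request(
    "/xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345"
)
print(result)
```

Here url_params is a single nested hash keyed by parameter name, which is the structure the kv filter would need to produce instead of flat url_param_* fields.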

Questions:

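  • Do I need to write my own plugin (I am not familiar with Ruby yet)?
  • Is there an existing plugin that does this (I did not find one, but maybe I searched badly)?
  • Can this be done without a plugin?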

Thanks for your help (and sorry for my English).

Renaud

Solution:

Thanks to Val, who found the solution. I changed my configuration to:

grok { match => { "request" => [ "url", "%{URIPATH:url_path}%{URIPARAM:url_params}?" ]} }
urldecode{ field => "url_path" }
mutate { gsub =>  ["url_params","\?","" ] }
kv {
  field_split => "&"
  source => "url_params"
  target => "url_params_hash"
}
urldecode{ field => "url_params_hash" }

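With this solution, the split stays correct even when an encoded "&" (%26) appears in the url_params string, because kv splits the still-encoded string first and urldecode runs only afterwards, on the values in url_params_hash.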
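
The effect of the ordering can be checked outside Logstash. The following Python sketch is only an illustration under assumptions: the two function names and the sample query string are made up for this example, and the code mimics the order of operations rather than the actual kv or urldecode implementations.

```python
from urllib.parse import unquote


def split_then_decode(query):
    """Split on "&" and "=" first, then percent-decode keys and values."""
    pairs = (item.split("=", 1) for item in query.split("&") if "=" in item)
    return {unquote(key): unquote(value) for key, value in pairs}


def decode_then_split(query):
    """Percent-decode the whole string first, then split it."""
    decoded = unquote(query)
    pairs = (item.split("=", 1) for item in decoded.split("&") if "=" in item)
    return dict(pairs)


# Hypothetical query string whose "test" value contains an encoded "&" and "=".
query = "pretty=true&test=a%26b%3D1"

good = split_then_decode(query)  # "test" keeps its full value "a&b=1"
bad = decode_then_split(query)   # the value is cut and a stray "b" key appears
print(good)
print(bad)
```

Decoding first turns %26 back into a literal "&", so the splitter can no longer tell it apart from a real parameter separator; splitting first avoids that ambiguity.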

1 answer:

Answer 0 (score: 2)

You are almost there with the kv filter. You only need to change its configuration slightly.

You also need to add another urldecode filter, for url_params, after the one for the path.

You would end up with something like this:

urldecode{ field => "url_path" }
urldecode{ field => "url_params" }
mutate { gsub =>  ["url_params","\?","" ] }
kv {
  field_split => "&"
  source => "url_params"
  target => "url_params_hash"
}