Splitting the http_referer URL path and query string into human-readable fields

Date: 2017-12-20 07:24:33

Tags: logstash elastic-stack

My nginx log lines look like this:

165.225.106.84 - - [20/Dec/2017:12:44:45 +0530] "POST /api/auction/auctionmaster/onauctionmasterfilter HTTP/1.1" 200 3227 "http://auction-dev.iquippo.com/viewauctions?type=upcoming" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36" "115.112.162.2" "{\x22auctionType\x22:\x22upcoming\x22,\x22addAuctionType\x22:true}"

I want to split my http_referer value like this:

domain:- http://auction-dev.iquippo.com
param1 :- viewauctions
param2:- if any
query_param1:- upcoming
and so on..

I tried the approach from this post on the Elastic discussion forum: https://discuss.elastic.co/t/extracting-domain-from-url/36219

But it does not work for me.

1 answer:

Answer 0 (score: 0)

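Note: this may contain typos, so you may not be able to copy and paste it directly, but it is a starting point for what you are trying to do.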

First store the referer in a field and mark the event with add_tag; the later stages then run only if that tag is present in [tags].

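# Step 1: pull the size and quoted referer out of the raw line.
# LINE_WITH_REFERAL is a custom pattern you have to define yourself; it is
# expected to capture that part of the line into a field named "referal".
grok {
    match => { "access_log_line" => "%{LINE_WITH_REFERAL}" }
    add_tag => [ "referal" ]
}

# Step 2: split the referer into the first path segment ("uri") and the rest ("data").
if "referal" in [tags] {
    grok {
        match => { "referal" => "%{POST0}" }
        add_tag => [ "referal_step2" ]
    }
}

# Step 3: match the remainder captured by POST0 against the POST1..POST4 alternatives.
if "referal_step2" in [tags] {
    grok {
        match => { "data" => "%{POST_COMP}" }
    }
}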

Sample lines:

3227 "http://auction-dev.iquippo.com/viewauctions?type=upcoming"    
2522 "http://auction-dev.iquippo.com/viewauctions?foo?type=upcoming"    
327 "http://auction-dev.iquippo.com/viewauctions?foo?bar?type=upcoming"

The first grok pattern, for the whole line:

POST0 %{INT} "http://%{IPORHOST}/%{WORD:uri}\?%{GREEDYDATA:data}
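
Custom pattern names such as POST0 are only known to grok if they are defined somewhere, either in a file referenced through patterns_dir or inline. The following is a minimal sketch of the inline variant using the grok filter's pattern_definitions option. It assumes the size and quoted referer are already in a field named "referal", and it uses a single-quoted string because the pattern itself contains a double quote.

```
filter {
  grok {
    # Inline definition of the custom pattern, so no separate patterns file is needed.
    pattern_definitions => {
      "POST0" => '%{INT} "http://%{IPORHOST}/%{WORD:uri}\?%{GREEDYDATA:data}'
    }
    # Assumption: "referal" holds text such as: 3227 "http://host/viewauctions?type=upcoming"
    match => { "referal" => "%{POST0}" }
    add_tag => [ "referal_step2" ]
  }
}
```

The other custom patterns can be registered the same way, or collected in one patterns file if they are shared between several filters.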

Grok patterns that match your parameters:

POST1 type=%{WORD:query_param}"
POST2 %{WORD:param1}\?type=%{WORD:query_param}"
POST3 %{WORD:param1}\?%{WORD:param2}\?type=%{WORD:query_param}"
POST4 %{WORD:param1}\?%{WORD:param2}\?%{WORD:param3}\?type=%{WORD:query_param}"

POST_COMP %{POST1}|%{POST2}|%{POST3}|%{POST4}
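
As a possible alternative to hand-written patterns, the referer could also be split with grok's built-in URI patterns and the kv filter. This is only a sketch under assumptions: the field name http_referer and all referer_* field names are hypothetical, and the referer is assumed to have been extracted into http_referer already, without the surrounding quotes.

```
filter {
  # Split the URL into scheme, host, path and raw query string.
  # URIPROTO, URIHOST, URIPATH and URIPARAM are patterns that ship with grok.
  grok {
    match => {
      "http_referer" => "%{URIPROTO:referer_scheme}://%{URIHOST:referer_domain}(?:%{URIPATH:referer_path})?(?:%{URIPARAM:referer_query})?"
    }
  }

  # Turn the query string (for example "?type=upcoming") into key/value pairs.
  # Both "&" and "?" act as pair separators; "=" is the default value separator.
  kv {
    source      => "referer_query"
    field_split => "&?"
    target      => "referer_params"
  }
}
```

With this layout the query parameters end up under referer_params (for the sample referer, a type key), so new parameters do not need a new pattern. Path segments would still need a separate step if each one should become its own field.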