ElasticSearch - How to aggregate access logs while ignoring GET parameters?

Time: 2017-07-14 08:04:36

Tags: elasticsearch

I want to aggregate accesses by function path.

{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "path.keyword": "/hex/*"
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 0,
  "aggs": {
    "path": {
      "terms": {
        "field": "path.keyword"
      }
    }
  }
}

The results I get look like this:

{
  "key": "/hex/user/admin_user/auth",
  "doc_count": 38
},
{
  "key": "/hex/report/chart/fastreport_lobby_all?start_date=2017-06-29&end_date=2017-07-05&category=date_range&value[]=payoff",
  "doc_count": 35
},
{
  "key": "/hex/report/chart/fastreport_lobby_all?start_date=2017-06-29&end_date=2017-07-05&category=lobby&value[]=payoff",
  "doc_count": 35
},
{
  "key": "/hex/report/chart/online_membership?start_date=2017-06-29&end_date=2017-07-05&category=datetime_range&value[]=user_total",
  "doc_count": 34
}

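There are two /hex/report/chart/fastreport_lobby_all?... results, which differ only in their GET parameters.

So neither bucket gives the real count for this function.

Is there any way to count them as one, like this?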

{
  "key": "/hex/report/chart/fastreport_lobby_all",
  "doc_count": 70
}

1 answer:

Answer 0: (score: 1)

I don't think this is possible without a custom analyzer, such as:
PUT your_index
{
   "settings": {
      "analysis": {
         "analyzer": {
            "query_analyzer": {
               "type": "custom",
               "tokenizer": "split_query",
               "filter": ["top1"]
            }
         },
         "filter":{
            "top1":{
                     "type": "limit",
                     "max_token_count": 1
                  }
         },
         "tokenizer":{
             "split_query":{
                  "type": "pattern",
                  "pattern": "\\?"
               }
         }
      }
   },
   "mappings": {
      "your_log_type": {
         "properties": {
            "path": {
               "type": "text",
               "fields": {
                  "keyword": {
                      "type":"keyword"
                  },
                  "no_query": {
                      "type":"text",
                      "fielddata":true,
                      "analyzer":"query_analyzer"
                  }
               }
            }
         }
      }
   }
}
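
To make the analyzer's effect concrete, here is a minimal Python sketch that emulates it outside Elasticsearch. It assumes the pattern tokenizer splits on each "?" and skips empty tokens, and that the limit filter then keeps only the first token. The function name emulate_query_analyzer is hypothetical and exists only for illustration.

```python
import re


def emulate_query_analyzer(path):
    """Rough emulation of query_analyzer: split_query tokenizer + top1 filter."""
    # split_query: a pattern tokenizer with pattern "\\?" splits on each "?".
    # Assumption: empty tokens are skipped, as the pattern tokenizer does.
    tokens = [token for token in re.split(r"\?", path) if token]
    # top1: a limit filter with max_token_count=1 keeps only the first token.
    return tokens[:1]


if __name__ == "__main__":
    with_params = "/hex/report/chart/fastreport_lobby_all?start_date=2017-06-29&category=lobby"
    print(emulate_query_analyzer(with_params))
    print(emulate_query_analyzer("/hex/user/admin_user/auth"))
```

Because only the part before the first "?" survives as a token, every request to the same path produces the same term in path.no_query, whatever its GET parameters were.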

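Then query (only documents indexed after this mapping is created will have path.no_query populated):

POST your_index/your_log_type/_search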
{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "path.keyword": "/hex/*"
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 0,
  "aggs" : {
        "genres" : {
            "terms" : { "field" : "path.no_query" }
        }
    }
}
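
As a sanity check on the totals this aggregation should produce, the buckets from the question can be merged client-side. The sketch below is an illustration only, not part of the Elasticsearch solution: merge_buckets is a hypothetical helper, and it can only merge the buckets that the terms aggregation actually returned, so it is no substitute for the path.no_query field on large indices.

```python
from collections import Counter

# Buckets copied from the question's path.keyword aggregation output.
BUCKETS = [
    {"key": "/hex/user/admin_user/auth", "doc_count": 38},
    {"key": "/hex/report/chart/fastreport_lobby_all?start_date=2017-06-29"
            "&end_date=2017-07-05&category=date_range&value[]=payoff",
     "doc_count": 35},
    {"key": "/hex/report/chart/fastreport_lobby_all?start_date=2017-06-29"
            "&end_date=2017-07-05&category=lobby&value[]=payoff",
     "doc_count": 35},
    {"key": "/hex/report/chart/online_membership?start_date=2017-06-29"
            "&end_date=2017-07-05&category=datetime_range&value[]=user_total",
     "doc_count": 34},
]


def merge_buckets(buckets):
    """Sum doc_count over buckets that share the same path before '?'."""
    totals = Counter()
    for bucket in buckets:
        # Keep only the part of the key before the first "?".
        path = bucket["key"].partition("?")[0]
        totals[path] += bucket["doc_count"]
    # Largest counts first, mirroring the default order of a terms aggregation.
    return [{"key": key, "doc_count": count} for key, count in totals.most_common()]


if __name__ == "__main__":
    for bucket in merge_buckets(BUCKETS):
        print(bucket)
```

With the question's data, the two fastreport_lobby_all buckets (35 + 35) collapse into a single bucket with a doc_count of 70, which matches the result the question asks for.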