Elasticsearch: accuracy of the filter aggregation

Time: 2016-02-18 10:21:08

Tags: elasticsearch filter aggregation date-histogram

I am new to Elasticsearch (using version 2.2). To simplify my problem: my documents have a field named termination, which can sometimes take the value transfer.

I am currently running this request to count, per month, the documents whose termination is transfer:

{
  "size": 0,
  "sort": [{
    "@timestamp": {
      "order": "desc",
      "unmapped_type": "boolean"
    }
  }],
  "query": { "match_all": {} },
  "aggs": {
    "report": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "month",
        "min_doc_count": 0
      },
      "aggs": {
        "calls_with_termination_transfer": {
          "filter": {
            "term": {
              "termination": "transfer"
            }
          }
        }
      }
    }
  }
}

Here is the response:

{
    "_shards": {
        "failed": 0, 
        "successful": 206, 
        "total": 206
    }, 
    "aggregations": {
        "report": {
            "buckets": [
                {
                    "calls_with_termination_transfer": {
                        "doc_count": 209163
                    }, 
                    "doc_count": 278100, 
                    "key": 1451606400000, 
                    "key_as_string": "2016-01-01T00:00:00.000Z"
                }, 
                {
                    "calls_with_termination_transfer": {
                        "doc_count": 107244
                    }, 
                    "doc_count": 136597, 
                    "key": 1454284800000, 
                    "key_as_string": "2016-02-01T00:00:00.000Z"
                }
            ]
        }
    }, 
    "hits": {
        "hits": [], 
        "max_score": 0.0, 
        "total": 414699
    }, 
    "timed_out": false, 
    "took": 90
}

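Why is the number of hits (414699) greater than the sum of the bucket document counts (278100 + 136597 = 414697)? I have read about accuracy issues in aggregations, but they do not seem to apply to filters... If I sum the counts of documents with termination transfer across buckets, will there still be an accuracy problem?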

1 answer:

Answer 0: (score: 0)

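My guess is that some documents are missing the @timestamp field. Such documents still count toward the total hits, but they do not fall into any date_histogram bucket.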

You can run an exists query on this field to verify this.
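
A minimal sketch of that check, assuming it is sent to the same indices as the original request: the exists query is wrapped in bool / must_not so that only documents without a @timestamp value match. With "size": 0 no documents are returned, and hits.total in the response is the number of documents lacking the field. If the guess is right, that number should account for the gap of 2 (414699 - 414697).

{
  "size": 0,
  "query": {
    "bool": {
      "must_not": {
        "exists": { "field": "@timestamp" }
      }
    }
  }
}

If hits.total comes back as 0, the missing field is not the explanation and the difference has to come from somewhere else.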