Elasticsearch: filter the first document for each unique ID

Time: 2015-04-15 04:56:36

Tags: search elasticsearch

I am writing an Elasticsearch query for the following scenario:

field1    field2
2015      20
2015      14
2014      39
2013      76
2013      2
2013      55

For each unique field1, I want the maximum value of field2, and then the sum of those maxima. For example, in this case I want value = 20 + 39 + 76.
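To make the target value concrete, here is a minimal Python sketch of the calculation on the sample rows. It is not an Elasticsearch query; the function name `sum_of_group_maxima` is a hypothetical helper used only for illustration.

```python
# Sample rows from the table above, as (field1, field2) pairs.
rows = [
    (2015, 20),
    (2015, 14),
    (2014, 39),
    (2013, 76),
    (2013, 2),
    (2013, 55),
]


def sum_of_group_maxima(pairs):
    """Return the sum of the largest field2 seen for each distinct field1.

    Hypothetical helper: it only restates the requirement in code.
    """
    best = {}
    for field1, field2 in pairs:
        # Keep the largest field2 found so far for this field1.
        if field1 not in best or field2 > best[field1]:
            best[field1] = field2
    return sum(best.values())


total = sum_of_group_maxima(rows)
print(total)  # 135, i.e. 20 + 39 + 76
```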

What Elasticsearch query returns this value?

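1 answer:

Answer 0 (score: 0)

I don't think this can be done with a single query in Elasticsearch 1.x. Version 2.0 may bring a feature such as reducers (see: https://github.com/elastic/elasticsearch/issues/8110).

You can get the first part of the task (the maximum of field2, grouped by field1) like this: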

DELETE /test_index

PUT /test_index
{
    "settings": {
        "number_of_shards": 1
    }
}

POST /test_index/_bulk
{"index":{"_index":"test_index","_type":"doc","_id":1}}
{"field1":2015,"field2":20}
{"index":{"_index":"test_index","_type":"doc","_id":2}}
{"field1":2015,"field2":14}
{"index":{"_index":"test_index","_type":"doc","_id":3}}
{"field1":2014,"field2":39}
{"index":{"_index":"test_index","_type":"doc","_id":4}}
{"field1":2013,"field2":76}
{"index":{"_index":"test_index","_type":"doc","_id":5}}
{"field1":2013,"field2":2}
{"index":{"_index":"test_index","_type":"doc","_id":6}}
{"field1":2013,"field2":55}

POST /test_index/_search
{
  "size": 0,
  "aggs": {
    "field1_group": {
      "terms": {
        "field": "field1",
        "size": 0,
        "order": {
          "maksior": "asc"
        }
      },
      "aggs": {
        "maksior": {
          "max": {
            "field": "field2"
          }
        }
      }
    }
  }
}

This gives you:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 6,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "field1_group": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": 2015,
               "doc_count": 2,
               "maksior": {
                  "value": 20
               }
            },
            {
               "key": 2014,
               "doc_count": 1,
               "maksior": {
                  "value": 39
               }
            },
            {
               "key": 2013,
               "doc_count": 3,
               "maksior": {
                  "value": 76
               }
            }
         ]
      }
   }
}

You can then iterate over the buckets and sum the values on the client side.
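The client-side step could look roughly like the Python sketch below. It assumes the response has already been parsed into a dict; the `response` literal is a trimmed copy of the output above, and `sum_bucket_maxima` is a hypothetical helper name, not part of any Elasticsearch client.

```python
# Trimmed copy of the "aggregations" part of the response shown above.
response = {
    "aggregations": {
        "field1_group": {
            "buckets": [
                {"key": 2015, "doc_count": 2, "maksior": {"value": 20}},
                {"key": 2014, "doc_count": 1, "maksior": {"value": 39}},
                {"key": 2013, "doc_count": 3, "maksior": {"value": 76}},
            ]
        }
    }
}


def sum_bucket_maxima(resp, agg_name="field1_group", metric_name="maksior"):
    """Add up the max value reported in each terms bucket.

    Hypothetical helper. Buckets whose metric value is None (for example,
    when no document in the bucket has field2) are skipped.
    """
    buckets = resp["aggregations"][agg_name]["buckets"]
    values = (bucket[metric_name]["value"] for bucket in buckets)
    return sum(value for value in values if value is not None)


total = sum_bucket_maxima(response)
print(total)  # 135, i.e. 20 + 39 + 76
```

The aggregation and metric names are passed as parameters so the same function works if the query uses different names than `field1_group` and `maksior`.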