Elasticsearch: getting a count of distinct values

Date: 2014-07-10 06:34:02

Tags: elasticsearch aggregate-functions

I need to use Elasticsearch to find the number of distinct values of an ID field (team_id).

My data looks like this:

{
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "16bcd4dc080f4c789018dd97f76741ef",
            "_score": 1,
            "_source": {
               "first_name": "jinu",
               "team_id": "500"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "9ed8afe738aa63c28b66994cef1f83c6",
            "_score": 1,
            "_source": {
               "first_name": "lal",
               "team_id": "500"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "1d238cd2f8c06790fc20859a16e3183b",
            "_score": 1,
            "_source": {
               "first_name": "author1",
               "team_id": "500"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "616ee1c00a02564f71bb6c3067054d55",
            "_score": 1,
            "_source": {
               "first_name": "kannan",
               "team_id": "400"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "d48132bfaed792f3c32d12e310d41c87",
            "_score": 1,
            "_source": {
               "first_name": "author3",
               "team_id": "400"
            }
         },
         {
            "_index": "outboxprov1",
            "_type": "user",
            "_id": "1a9d05586a8dc3f29b4c8147997391f9",
            "_score": 1,
            "_source": {
               "first_name": "dibish",
               "team_id": "100"
            }
         }

      ]
   } 

There are three distinct team_id values here: 500, 400, and 100. In this case I want the count to come out as 3. I have tried the cardinality aggregation:

{
  "size": 0, 
    "query" : {
        "match_all" : {  }
    },
    "aggs" : {
        "team_id_count" : {
            "cardinality" : {
                "field" : "team_id"
            }
        }
    }

}

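This returns the correct result, but the Elasticsearch documentation states that cardinality is an experimental feature and may change in the future.

Is there a way to achieve this without the cardinality aggregation? Is there any problem with using this experimental cardinality feature? Please point me in the right direction.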

1 Answer:

Answer 0: (score: 2)

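You can use a terms aggregation. It returns one bucket per distinct team_id value, so the number of buckets in the response is the distinct count.

Like this: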

curl -XPOST http://localhost:9200/outboxprov1/user/_search -d '
{
  "size": 0,
    "query" : {
        "match_all" : {  }
    },
    "aggs" : {
        "team_id_count" : {
            "terms" : {
                "field" : "team_id"
            }
        }
    }

}'
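
The terms aggregation does not return a single number; it returns a list of buckets that the client has to count. The following is a minimal Python sketch of that client-side step, not part of the original answer. The helper name count_distinct_terms is hypothetical, and sample_response is a hand-written stand-in shaped like a terms aggregation response for the six documents in the question, not real output from a cluster.

```python
def count_distinct_terms(response, agg_name):
    """Return the number of buckets in a terms aggregation result.

    `response` is the search response already parsed from JSON into a dict.
    Each bucket corresponds to one distinct value of the aggregated field.
    """
    buckets = response["aggregations"][agg_name]["buckets"]
    return len(buckets)


# Hand-written sample (an assumption, not captured from a live cluster):
# team_id "500" appears 3 times, "400" twice, "100" once.
sample_response = {
    "aggregations": {
        "team_id_count": {
            "buckets": [
                {"key": "500", "doc_count": 3},
                {"key": "400", "doc_count": 2},
                {"key": "100", "doc_count": 1},
            ]
        }
    }
}

print(count_distinct_terms(sample_response, "team_id_count"))  # prints 3
```

Note that a terms aggregation returns only the top 10 buckets by default, so the bucket count equals the distinct count only when the aggregation's "size" parameter is large enough to return every term. For fields with many distinct values this gets expensive, which is the case the cardinality aggregation is meant for.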