I have some documents:
{"name": "John", "district": 1},
{"name": "Mary", "district": 2},
{"name": "Nick", "district": 1},
{"name": "Bob", "district": 3},
{"name": "Kenny", "district": 1}
How can I filter/select distinct documents by district, so that I get one document per district?
{"name": "John", "district": 1},
{"name": "Mary", "district": 2},
{"name": "Bob", "district": 3}
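In SQL I could use GROUP BY. I tried a terms aggregation, but it only returns the count for each distinct value, not the documents themselves.
"aggs": {
  "distinct": {
    "terms": {
      "field": "district",
      "size": 0
    }
  }
}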
Thanks for your help! :-)
Answer 0 (score: 35)
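If your Elasticsearch version is 1.3 or above, you can use a sub-aggregation of type top_hits, which will (by default) give you the top three matching documents in each bucket, sorted by query score (here every score is 1, because you are using a match_all query).
You can set the size parameter to more than 3, or to 1 if you only want a single document per district.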
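To make the behaviour concrete, here is a minimal plain-Java sketch of what a terms aggregation with a top_hits sub-aggregation computes: documents are bucketed by district and at most size documents are kept per bucket. It does not talk to Elasticsearch at all. The names TopHitsModel, Doc, and topPerDistrict are hypothetical and exist only for this illustration, and sorting by score is left out on the assumption that every document scores the same under match_all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TopHitsModel {

    // Hypothetical document type mirroring the {"name", "district"} documents.
    static final class Doc {
        final String name;
        final int district;

        Doc(String name, int district) {
            this.name = name;
            this.district = district;
        }
    }

    // Buckets documents by district (the role of "terms") and keeps at most
    // `size` documents per bucket (the role of "top_hits" with that size).
    // Buckets appear in first-seen order here; Elasticsearch orders terms
    // buckets by document count by default.
    static Map<Integer, List<Doc>> topPerDistrict(List<Doc> docs, int size) {
        Map<Integer, List<Doc>> buckets = new LinkedHashMap<>();
        for (Doc doc : docs) {
            List<Doc> bucket = buckets.computeIfAbsent(doc.district, k -> new ArrayList<>());
            if (bucket.size() < size) {
                bucket.add(doc);
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("John", 1),
                new Doc("Mary", 2),
                new Doc("Nick", 1),
                new Doc("Bob", 3),
                new Doc("Kenny", 1));

        // size = 1 keeps a single document per district.
        for (Map.Entry<Integer, List<Doc>> entry : topPerDistrict(docs, 1).entrySet()) {
            for (Doc doc : entry.getValue()) {
                System.out.println("district " + entry.getKey() + ": " + doc.name);
            }
        }
    }
}
```

With a size of 1 this model keeps John, Mary, and Bob, which is the result asked for in the question; a larger size keeps more documents per district.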
The following dataset and query:
POST /test/districts/
{"name": "John", "district": 1}
POST /test/districts/
{"name": "Mary", "district": 2}
POST /test/districts/
{"name": "Nick", "district": 1}
POST /test/districts/
{"name": "Bob", "district": 3}
POST test/districts/_search
{
  "size": 0,
  "aggs": {
    "by_district": {
      "terms": {
        "field": "district",
        "size": 0
      },
      "aggs": {
        "tops": {
          "top_hits": {
            "size": 10
          }
        }
      }
    }
  }
}
will output the documents grouped by district, the way you want:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "by_district": {
      "buckets": [
        {
          "key": 1,
          "key_as_string": "1",
          "doc_count": 2,
          "tops": {
            "hits": {
              "total": 2,
              "max_score": 1,
              "hits": [
                {
                  "_index": "test",
                  "_type": "districts",
                  "_id": "XYHu4I-JQcOfLm3iWjTiOg",
                  "_score": 1,
                  "_source": {
                    "name": "John",
                    "district": 1
                  }
                },
                {
                  "_index": "test",
                  "_type": "districts",
                  "_id": "5dul2XMTRC2IpV_tKRRltA",
                  "_score": 1,
                  "_source": {
                    "name": "Nick",
                    "district": 1
                  }
                }
              ]
            }
          }
        },
        {
          "key": 2,
          "key_as_string": "2",
          "doc_count": 1,
          "tops": {
            "hits": {
              "total": 1,
              "max_score": 1,
              "hits": [
                {
                  "_index": "test",
                  "_type": "districts",
                  "_id": "I-9Gd4OYSRuexhP1dCdQ-g",
                  "_score": 1,
                  "_source": {
                    "name": "Mary",
                    "district": 2
                  }
                }
              ]
            }
          }
        },
        {
          "key": 3,
          "key_as_string": "3",
          "doc_count": 1,
          "tops": {
            "hits": {
              "total": 1,
              "max_score": 1,
              "hits": [
                {
                  "_index": "test",
                  "_type": "districts",
                  "_id": "bti2y-OUT3q2mBNhhI3xeA",
                  "_score": 1,
                  "_source": {
                    "name": "Bob",
                    "district": 3
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
Answer 1 (score: 2)
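Elasticsearch does not return distinct documents by value or by unique value. However, if you are using the Java client, or can translate this into a suitable language, you can work around it on the client side: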
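import java.util.HashMap;
import java.util.Map;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.search.SearchHit;

// "client" is assumed to be an already configured Elasticsearch Client.
SearchResponse response = client.prepareSearch().execute().actionGet();
// One entry per district; a later hit overwrites an earlier one for the same district.
Map<String, SearchHit> distinctObjects = new HashMap<String, SearchHit>();
for (SearchHit searchHit : response.getHits()) {
    Map<String, Object> source = searchHit.getSource();
    if (source.get("district") != null) {
        // Store the hit itself: the map's value type is SearchHit, not the source map.
        distinctObjects.put(source.get("district").toString(), searchHit);
    }
}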
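The deduplication step itself can be tried without a cluster. The following is a minimal sketch, not the client API: plain maps stand in for each hit's source, and the names DistinctByDistrict, firstPerDistrict, and doc are hypothetical helpers introduced for illustration. It uses putIfAbsent so that the first document seen for each district is kept, whereas a plain put keeps the last one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DistinctByDistrict {

    // Keeps the first document seen for each district, in encounter order.
    // Documents without a "district" field are skipped.
    static Map<String, Map<String, Object>> firstPerDistrict(List<Map<String, Object>> sources) {
        Map<String, Map<String, Object>> distinct = new LinkedHashMap<>();
        for (Map<String, Object> source : sources) {
            Object district = source.get("district");
            if (district != null) {
                distinct.putIfAbsent(district.toString(), source);
            }
        }
        return distinct;
    }

    // Hypothetical helper that builds one stand-in document.
    static Map<String, Object> doc(String name, int district) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("name", name);
        source.put("district", district);
        return source;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> sources = new ArrayList<>();
        sources.add(doc("John", 1));
        sources.add(doc("Mary", 2));
        sources.add(doc("Nick", 1));
        sources.add(doc("Bob", 3));
        sources.add(doc("Kenny", 1));

        for (Map<String, Object> source : firstPerDistrict(sources).values()) {
            System.out.println(source.get("name") + " (district " + source.get("district") + ")");
        }
    }
}
```

Keep in mind that a client-side approach like this only deduplicates the hits the search actually returned, so a district whose documents are not in the returned page will be missing from the result.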