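Elasticsearch: terms aggregation on a nested field with duplicate values

Time: 2017-04-20 12:15:12

Tags: elasticsearch aggregation elasticsearch-5

I have a problem with nested aggregations in Elasticsearch. I have a mapping with a nested field: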

POST my_index/my_type/_mapping
{
    "properties": {
        "name": {
            "type": "keyword"
        },
        "nested_fields": {
            "type": "nested",
            "properties": {
                "key": {
                    "type": "keyword"
                },
                "value": {
                    "type": "keyword"
                }
            }
        }
    }
}

Then I add one document to the index:

POST my_index/my_type
{
    "name": "object1",
    "nested_fields": [
        {
            "key": "key1",
            "value": "value1"
        },
        {
            "key": "key1",
            "value": "value2"
        }
    ]
}

As you can see, my nested array contains two items that have the same key field but different value fields. Then I want to run a query like this:

GET /my_index/my_type/_search
{
    "query": {
        "nested": {
            "path": "nested_fields",
            "query": {
                "bool": {
                    "must": [
                        {
                            "term": {
                                "nested_fields.key": {
                                    "value": "key1"
                                }
                            }
                        },
                        {
                            "terms": {
                                "nested_fields.value": [
                                    "value1",
                                    "value2"
                                ]
                            }
                        }
                    ]
                }
            }
        }
    },
    "aggs": {
        "agg_nested_fields": {
            "nested": {
                "path": "nested_fields"
            },
            "aggs": {
                "agg_nested_fields_key": {
                    "terms": {
                        "field": "nested_fields.key",
                        "size": 10
                    }
                }
            }
        }
    }
}

As you can see, I want to find all documents that have at least one object in the nested_fields array whose key property equals key1 and whose value is one of the provided values (value1 or value2). Then I want to group the documents found by nested_fields.key. But I get this response:

{
    "took": 13,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.87546873,
        "hits": [
            {
                "_index": "my_index",
                "_type": "my_type",
                "_id": "AVuLNXxiryKmA7VEwOfV",
                "_score": 0.87546873,
                "_source": {
                    "name": "object1",
                    "nested_fields": [
                        {
                            "key": "key1",
                            "value": "value1"
                        },
                        {
                            "key": "key1",
                            "value": "value2"
                        }
                    ]
                }
            }
        ]
    },
    "aggregations": {
        "agg_nested_fields": {
            "doc_count": 2,
            "agg_nested_fields_key": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {
                        "key": "key1",
                        "doc_count": 2
                    }
                ]
            }
        }
    }
}

As the response shows, I get one hit (which is correct), but the document is counted twice in the aggregation (see "doc_count": 2), because it has two items with the 'key1' value in the nested_fields array. How can I get the correct count in the aggregation?

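1 answer:

Answer 0 (score: 0)

You have to use a reverse_nested aggregation inside the nested aggregation so that the counts are computed on the root documents instead of on the nested objects.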

{
    "query": {
        "nested": {
            "path": "nested_fields",
            "query": {
                "bool": {
                    "must": [{
                            "term": {
                                "nested_fields.key": {
                                    "value": "key1"
                                }
                            }
                        },
                        {
                            "terms": {
                                "nested_fields.value": [
                                    "value1",
                                    "value2"
                                ]
                            }
                        }
                    ]
                }
            }
        }
    },
    "aggs": {
        "agg_nested_fields": {
            "nested": {
                "path": "nested_fields"
            },
            "aggs": {
                "agg_nested_fields_key": {
                    "terms": {
                        "field": "nested_fields.key",
                        "size": 10
                    },
                    "aggs": {
                        "back_to_root": {
                            "reverse_nested": {}
                        }
                    }
                }
            }
        }
    }
}
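
To see why the two counts differ, here is a minimal Python sketch. It is not Elasticsearch code: `docs`, `nested_terms_counts` and `root_doc_counts` are hypothetical stand-ins that only mimic the counting. Each nested object is indexed as its own hidden document, so a terms aggregation under nested counts objects, while reverse_nested (which joins back to the root document when no path is set) counts parent documents.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the index: one root document
# holding two nested objects that share the same key.
docs = [
    {
        "name": "object1",
        "nested_fields": [
            {"key": "key1", "value": "value1"},
            {"key": "key1", "value": "value2"},
        ],
    },
]


def nested_terms_counts(documents):
    """Count nested objects per key (what nested + terms reports)."""
    counts = Counter()
    for doc in documents:
        for item in doc["nested_fields"]:
            counts[item["key"]] += 1
    return dict(counts)


def root_doc_counts(documents):
    """Count distinct root documents per key (what reverse_nested reports)."""
    counts = Counter()
    for doc in documents:
        # A set collapses repeated keys inside one root document.
        for key in {item["key"] for item in doc["nested_fields"]}:
            counts[key] += 1
    return dict(counts)


print(nested_terms_counts(docs))  # {'key1': 2}
print(root_doc_counts(docs))      # {'key1': 1}
```

In the real response, the per-document figure should therefore be read from back_to_root.doc_count inside each bucket, not from the bucket's own doc_count.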