Question

My index contains the following documents:

[
    {
        "name": "Marco",
        "city_id": 45,
        "city": "Rome"
    },
    {
        "name": "John",
        "city_id": 46,
        "city": "London"
    },
    {
        "name": "Ann",
        "city_id": 47,
        "city": "New York"
    },
    ...
]

and this aggregation:

"aggs": {
    "city": {
        "terms": {
            "field": "city"
        }
    }
}

which gives me a response like this:

{
    "aggregations": {    
        "city": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 694,
            "buckets": [
                {
                    "key": "Rome",
                    "doc_count": 15126
                },
                {
                    "key": "London",
                    "doc_count": 11395
                },
                {
                    "key": "New York",
                    "doc_count": 14836
                },
                ...
          ]
        },
        ...
    }
}

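My problem is that I need city_id in my aggregation results as well. I have read that a multi-field terms aggregation isn't possible, but I don't need to aggregate on two fields; I only need to return another field whose value is always the same for each term (basically a city / city_id pair). What is the best way to achieve this without losing performance?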

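I could create a field named city_with_id containing values such as "Rome;45", "London;46", etc. and aggregate on that field. That would work for me, because I can simply split the result in my backend and get the ID I need, but is it the best approach?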
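To make the combined-field idea concrete, here is a minimal sketch of the backend split in Python. It assumes the terms aggregation was run on the city_with_id field and that keys look like "Rome;45"; the function name `split_city_buckets` and the sample buckets are hypothetical, made up only for illustration.

```python
def split_city_buckets(buckets, sep=";"):
    """Turn buckets keyed like "Rome;45" into rows with separate city and city_id."""
    rows = []
    for bucket in buckets:
        # rpartition splits on the LAST separator, so a city name that
        # happens to contain the separator is still parsed correctly.
        city, _, city_id = bucket["key"].rpartition(sep)
        rows.append({
            "city": city,
            "city_id": int(city_id),
            "doc_count": bucket["doc_count"],
        })
    return rows


# Hypothetical buckets, shaped like the "buckets" array of a terms
# aggregation on the assumed city_with_id field.
sample_buckets = [
    {"key": "Rome;45", "doc_count": 15126},
    {"key": "London;46", "doc_count": 11395},
]

for row in split_city_buckets(sample_buckets):
    print(row)
```

The split itself is cheap; the cost of this approach is mostly the extra field that has to be indexed and kept in sync with city and city_id.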

Answer 1

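One approach is to use a top_hits sub-aggregation with source filtering so that only city_id is returned, as in the example below. I don't think this would be much slower. You could try it on your index to measure the impact before falling back to the combined-field approach (city_with_id) described in the question.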

Example:

    POST <index>/_search
    {
        "size" : 0,
        "aggs": {
            "city": {
                "terms": {
                    "field": "city"
                },
                "aggs" : {
                    "id" : {
                        "top_hits" : {
                            "_source": {
                                "include": [
                                    "city_id"
                                ]
                            },
                            "size" : 1
                        }
                    }
                }
            }
        }
    }

Result:

 {
               "key": "London",
               "doc_count": 2,
               "id": {
                  "hits": {
                     "total": 2,
                     "max_score": 1,
                     "hits": [
                        {
                           "_index": "country",
                           "_type": "city",
                           "_id": "2",
                           "_score": 1,
                           "_source": {
                              "city_id": 46
                           }
                        }
                     ]
                  }
               }
            },
            {
               "key": "New York",
               "doc_count": 1,
               "id": {
                  "hits": {
                     "total": 1,
                     "max_score": 1,
                     "hits": [
                        {
                           "_index": "country",
                           "_type": "city",
                           "_id": "3",
                           "_score": 1,
                           "_source": {
                              "city_id": 47
                           }
                        }
                     ]
                  }
               }
            },
            {
               "key": "Rome",
               "doc_count": 1,
               "id": {
                  "hits": {
                     "total": 1,
                     "max_score": 1,
                     "hits": [
                        {
                           "_index": "country",
                           "_type": "city",
                           "_id": "1",
                           "_score": 1,
                           "_source": {
                              "city_id": 45
                           }
                        }
                     ]
                  }
               }
            }
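On the client side, the city_id then has to be pulled out of each bucket's single hit. A minimal sketch in Python, assuming the response has already been parsed into a dict and that the aggregations are named "city" and "id" as in the request above; the helper name `city_ids_from_response` is hypothetical, and the sample response is a trimmed copy of the result shown here.

```python
def city_ids_from_response(response, agg_name="city", hits_name="id"):
    """Map each city bucket key to its city_id and doc_count."""
    result = {}
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        if not hits:
            # Defensive: skip a bucket that came back without a hit.
            continue
        result[bucket["key"]] = {
            "city_id": hits[0]["_source"]["city_id"],
            "doc_count": bucket["doc_count"],
        }
    return result


# Trimmed copy of the result above, wrapped in the usual
# "aggregations" -> "city" -> "buckets" envelope.
sample_response = {
    "aggregations": {
        "city": {
            "buckets": [
                {"key": "London", "doc_count": 2,
                 "id": {"hits": {"hits": [{"_source": {"city_id": 46}}]}}},
                {"key": "New York", "doc_count": 1,
                 "id": {"hits": {"hits": [{"_source": {"city_id": 47}}]}}},
                {"key": "Rome", "doc_count": 1,
                 "id": {"hits": {"hits": [{"_source": {"city_id": 45}}]}}},
            ]
        }
    }
}

print(city_ids_from_response(sample_response))
```

This relies on the premise from the question that every document in a city bucket carries the same city_id, so taking the first hit is enough.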
