Question

I have an index with two fields and a few documents, like this:

city                team
=========================================
New York            New York Knicks
New York            Brooklyn Nets
New Orleans         New Orleans Pelicans

My goal is to provide autocomplete that searches both fields, like this:

Query: [ new                  ]
       +----------------------+
       |     Cities           |
       +----------------------+
       | New York             |
       | New Orleans          |
       +----------------------+
       |     Teams            |
       +----------------------+
       | New York Knicks      |
       | New Orleans Pelicans |
       +----------------------+

My query for filtering the documents is quite simple:

"query": {
    "bool": {
        "should": [
            {
                "match_phrase_prefix": {
                    "city": "new"
                }
            },
            {
                "match_phrase_prefix": {
                    "team": "new"
                }
            }
        ]
    }
}

However, I'm having trouble with the aggregations. My first approach was:

"aggs": {
    "city": {
        "terms": {
            "field": "city.raw"
        }
    },
    "team": {
        "terms": {
            "field": "team.raw"
        }
    }
}

(raw is a not_analyzed copy of the field, used for aggregation)

This doesn't work, because the results include Brooklyn Nets, which they shouldn't:

"aggregations": {
    "city": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
            {
                "key": "New York",
                "doc_count": 2
            },
            {
                "key": "New Orleans",
                "doc_count": 1
            }
        ]
    },
    "team": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
            {
                "key": "Brooklyn Nets",
                "doc_count": 1
            },
            {
                "key": "New Orleans Pelicans",
                "doc_count": 1
            },
            {
                "key": "New York Knicks",
                "doc_count": 1
            }
        ]
    }
}

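I don't know how to make this work with a single request. This example is only illustrative; in the real scenario I have many more fields and documents to search and aggregate, so sending multiple requests to the server is not a good idea, especially since an autocomplete system should be as fast as possible.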

Any help would be greatly appreciated.

Answer 1

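You need to use filter aggregations, so that each aggregation only covers the documents that match the corresponding clause of the query: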

  "aggs": {
    "city": {
      "filter": {
        "bool": {
          "must": [
            {
              "query": {
                "match_phrase_prefix": {
                  "city": "new"
                }
              }
            }
          ]
        }
      },
      "aggs": {
        "cities": {
          "terms": {
            "field": "city.raw"
          }
        }
      }
    },
    "team": {
      "filter": {
        "bool": {
          "must": [
            {
              "query": {
                "match_phrase_prefix": {
                  "team": "new"
                }
              }
            }
          ]
        }
      },
      "aggs": {
        "teams": {
          "terms": {
            "field": "team.raw"
          }
        }
      }
    }
  }
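With more fields, the same pattern can be generated rather than written by hand. The following is a minimal sketch, not a definitive implementation: the helper name build_autocomplete_body, the sub-aggregation name "values" and the size parameter are made up for illustration, and it assumes every field has a .raw sub-field as in the question. It also assumes Elasticsearch 2.x or later, where a query can be placed directly inside a filter aggregation; on 1.x keep the "query" wrapper shown above.

```python
import json


def build_autocomplete_body(prefix, fields, size=10):
    """Build a search body with one filter aggregation per field.

    Hypothetical helper: assumes each field has a `.raw` sub-field
    suitable for terms aggregations.
    """
    # One should-clause per field, same as the query in the question.
    should = [{"match_phrase_prefix": {f: prefix}} for f in fields]

    aggs = {}
    for f in fields:
        aggs[f] = {
            # Restrict this aggregation to documents whose own field matches.
            "filter": {"match_phrase_prefix": {f: prefix}},
            "aggs": {
                "values": {"terms": {"field": f + ".raw", "size": size}}
            },
        }

    return {
        "size": 0,  # only the aggregations are needed, not the hits
        "query": {"bool": {"should": should}},
        "aggs": aggs,
    }


body = build_autocomplete_body("new", ["city", "team"])
print(json.dumps(body, indent=2))
```

The printed body can be sent as a single _search request; each top-level aggregation then holds the suggestions for one field.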

Answer 2

Your query,

"query": {
    "bool": {
        "should": [
            {
                "match_phrase_prefix": {
                    "city": "new"
                }
            },
            {
                "match_phrase_prefix": {
                    "team": "new"
                }
            }
        ]
    }
}

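returns the document with city "New York" and team "Brooklyn Nets" in its results, because the city field matches the prefix "new" even though the team field does not.

I think that is why this document gets counted when you aggregate. Because its city is "New York", the document with team "Brooklyn Nets" is part of the query's result set, so it is counted in the team buckets as well.

If you want to check this, set minimum_should_match to 2.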
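To make the reasoning concrete, here is a small pure-Python simulation of the three documents from the question. It is only a sketch: prefix_match is a rough stand-in for match_phrase_prefix with a one-word query, and all function names are invented for the illustration. Nothing here talks to Elasticsearch.

```python
from collections import Counter

DOCS = [
    {"city": "New York", "team": "New York Knicks"},
    {"city": "New York", "team": "Brooklyn Nets"},
    {"city": "New Orleans", "team": "New Orleans Pelicans"},
]
FIELDS = ["city", "team"]


def prefix_match(value, prefix):
    # Rough stand-in for match_phrase_prefix with a single-word query:
    # true if any whitespace-separated token starts with the prefix.
    return any(tok.startswith(prefix.lower()) for tok in value.lower().split())


def search(prefix, minimum_should_match=1):
    # bool/should: keep a document if enough field clauses match.
    return [
        d for d in DOCS
        if sum(prefix_match(d[f], prefix) for f in FIELDS) >= minimum_should_match
    ]


def plain_terms(docs, field):
    # Terms aggregation over every document in the result set.
    return Counter(d[field] for d in docs)


def filtered_terms(docs, field, prefix):
    # Terms aggregation wrapped in a filter on the same field.
    return Counter(d[field] for d in docs if prefix_match(d[field], prefix))


hits = search("new")
print(plain_terms(hits, "team"))             # unfiltered team buckets
print(filtered_terms(hits, "team", "new"))   # team buckets behind a filter
print(plain_terms(search("new", 2), "city")) # minimum_should_match = 2
```

In this simulation the unfiltered team buckets contain Brooklyn Nets, while the filtered ones do not. With minimum_should_match set to 2 the Brooklyn Nets document drops out of the result set entirely, but so would any document where only one field matches, so that setting is more useful as a check than as a fix.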

Elasticsearch: aggregating on multiple fields separately
