How to filter a top_hits aggregation by field

Date: 2019-09-15 03:21:26

Tags: elasticsearch elasticsearch-aggregation

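I'm running into some problems building a query against a very large index of 600M documents. I'm close to a solution, but I'm stuck.

My documents look like this:

{
  "first_name" : "John",
  "last_name" : "Doe",
  "company_domain" : "google",
  "provider_a_id" : "1234",
  "provider_b_id" : "14"
}

I need to return 2 contacts per company, where provider_a_id matches a list of IDs I obtained earlier.

I came up with this aggregation, which returns 2 contacts per company:

{
  "size": 0,
  "aggs": {
    "COMPANIES": {
      "terms": {
        "field": "company_domain.keyword",
        "order": { "_key": "asc" },
        "size": 2
      },
      "aggs": {
        "EMPLOYEES": {
          "top_hits": { "size": 2 }
        }
      }
    }
  }
}

This is good because it solves part of the problem, but I now also need to narrow the search by provider_a_id, with something like a filter on that field applied to the contacts returned for each company.

Do you know how I can solve this?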

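2 Answers:

Answer 0: (score: 1)

You need a filter aggregation in front of top_hits. A term query filters on a single value; to filter on a list of IDs, use a terms query inside the filter, as the query below does.

Mapping: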

PUT testindex7/_mappings
{
  "properties": {
    "first_name" :{
      "type": "text"
    },
    "last_name" : {
      "type": "text"
    },
    "company_domain" :{
      "type": "text",
      "fields": {
         "keyword":{
           "type": "keyword"
         }  
      }
    },
    "provider_a_id" : {
      "type": "integer"
    },
    "provider_b_id" : {
      "type": "integer"
    }
  }
}

Data:

 [
      {
        "_index" : "testindex7",
        "_type" : "_doc",
        "_id" : "OvU4OG0BCNyxVsPT3Xtn",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "a",
          "last_name" : "b",
          "company_domain" : "google",
          "provider_a_id" : "100",
          "provider_b_id" : "1"
        }
      },
      {
        "_index" : "testindex7",
        "_type" : "_doc",
        "_id" : "O_U5OG0BCNyxVsPTAHsD",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "c",
          "last_name" : "d",
          "company_domain" : "google",
          "provider_a_id" : "101",
          "provider_b_id" : "2"
        }
      },
      {
        "_index" : "testindex7",
        "_type" : "_doc",
        "_id" : "PPU5OG0BCNyxVsPTJ3tZ",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "e",
          "last_name" : "f",
          "company_domain" : "google",
          "provider_a_id" : "102",
          "provider_b_id" : "3"
        }
      }
    ]

Query:

GET testindex7/_search
{
  "size": 0,
  "aggs": {
    "COMPANIES": {
      "terms": {
        "field": "company_domain.keyword",
        "order": {
          "_key": "asc"
        },
        "size": 2
      },
      "aggs": {
        "EMPLOYEES": {
          "filter": { 
            "terms": {
              "provider_a_id": [100,101]
            }
          },
          "aggs": {
            "top_emps": {
              "top_hits": {
                "size": 2
              }
            }
          }
        }
      }
    }
  }
}
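
The query above hard-codes two IDs, while the question starts from a list of IDs obtained earlier. As a minimal sketch (not part of the original answer), the request body could be generated from that list in Python. The function name build_filtered_top_hits and its parameters are hypothetical; the field and aggregation names follow the query above, and the code only builds the JSON body without sending it.

```python
import json


def build_filtered_top_hits(provider_a_ids, companies=2, per_company=2):
    """Build a search body with a filter aggregation in front of top_hits.

    Hypothetical helper for illustration only; it mirrors the query above
    but takes the ID list as a parameter.
    """
    return {
        "size": 0,  # no top-level hits, only aggregations
        "aggs": {
            "COMPANIES": {
                "terms": {
                    "field": "company_domain.keyword",
                    "order": {"_key": "asc"},
                    "size": companies,
                },
                "aggs": {
                    "EMPLOYEES": {
                        # keep only contacts whose ID is in the list
                        "filter": {
                            "terms": {"provider_a_id": list(provider_a_ids)}
                        },
                        "aggs": {
                            "top_emps": {"top_hits": {"size": per_company}}
                        },
                    }
                },
            }
        },
    }


body = build_filtered_top_hits([100, 101])
print(json.dumps(body, indent=2))
```

Two caveats with this shape: the company buckets are still computed over all documents, so a company with no matching contact appears with an empty EMPLOYEES bucket (doc_count 0); and Elasticsearch caps the number of values in a terms query (index.max_terms_count, 65,536 by default), so a very long ID list may need to be split.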

Result:

"aggregations" : {
    "COMPANIES" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "google",
          "doc_count" : 3,
          "EMPLOYEES" : {
            "doc_count" : 2,
            "top_emps" : {
              "hits" : {
                "total" : {
                  "value" : 2,
                  "relation" : "eq"
                },
                "max_score" : 1.0,
                "hits" : [
                  {
                    "_index" : "testindex7",
                    "_type" : "_doc",
                    "_id" : "OvU4OG0BCNyxVsPT3Xtn",
                    "_score" : 1.0,
                    "_source" : {
                      "first_name" : "a",
                      "last_name" : "b",
                      "company_domain" : "google",
                      "provider_a_id" : "100",
                      "provider_b_id" : "1"
                    }
                  },
                  {
                    "_index" : "testindex7",
                    "_type" : "_doc",
                    "_id" : "O_U5OG0BCNyxVsPTAHsD",
                    "_score" : 1.0,
                    "_source" : {
                      "first_name" : "c",
                      "last_name" : "d",
                      "company_domain" : "google",
                      "provider_a_id" : "101",
                      "provider_b_id" : "2"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
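
To get from this nested result back to "2 contacts per company", the buckets have to be unpacked. The following Python sketch assumes the response has already been parsed into a dict; contacts_by_company is a hypothetical helper, and sample is a trimmed copy of the result above.

```python
def contacts_by_company(response):
    """Map each company key to the _source of its filtered top hits."""
    result = {}
    for bucket in response["aggregations"]["COMPANIES"]["buckets"]:
        hits = bucket["EMPLOYEES"]["top_emps"]["hits"]["hits"]
        if hits:  # skip companies with no matching contact
            result[bucket["key"]] = [hit["_source"] for hit in hits]
    return result


# Trimmed copy of the result shown above.
sample = {
    "aggregations": {
        "COMPANIES": {
            "buckets": [
                {
                    "key": "google",
                    "doc_count": 3,
                    "EMPLOYEES": {
                        "doc_count": 2,
                        "top_emps": {
                            "hits": {
                                "hits": [
                                    {"_source": {"first_name": "a", "provider_a_id": "100"}},
                                    {"_source": {"first_name": "c", "provider_a_id": "101"}},
                                ]
                            }
                        },
                    },
                }
            ]
        }
    }
}

for company, contacts in contacts_by_company(sample).items():
    print(company, [c["provider_a_id"] for c in contacts])
# prints: google ['100', '101']
```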

Answer 1: (score: 1)

Use a query together with the aggs:

{
  "query": {
    "term": {
      "provider_a_id": "1234"
    }
  },
  "aggs": {
    "COMPANIES": {
      "terms": {
        "field": "company_domain.keyword",
        "order": { "_key": "asc" },
        "size": 2
      },
      "aggs": {
        "EMPLOYEES": {
          "top_hits": {
            "size": 2
          }
        }
      }
    }
  }
}
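
The term query above matches a single ID. For the list of IDs from the question, the same approach would presumably use a terms query instead. Here is a minimal Python sketch under that assumption; build_query_filtered and its parameters are hypothetical, and "size": 0 is carried over from the question rather than from this answer.

```python
def build_query_filtered(provider_a_ids, companies=2, per_company=2):
    """Build a search body that filters at query level, then aggregates.

    Hypothetical helper for illustration only.
    """
    return {
        "size": 0,  # as in the question: suppress top-level hits
        # restrict the documents before any bucket is built
        "query": {"terms": {"provider_a_id": list(provider_a_ids)}},
        "aggs": {
            "COMPANIES": {
                "terms": {
                    "field": "company_domain.keyword",
                    "order": {"_key": "asc"},
                    "size": companies,
                },
                "aggs": {
                    "EMPLOYEES": {"top_hits": {"size": per_company}}
                },
            }
        },
    }


body = build_query_filtered([1234, 5678])
print(body["query"])
```

Because the query runs before the aggregation, only companies with at least one matching contact get a bucket, and each bucket's doc_count counts matching contacts only. With a filter sub-aggregation, by contrast, the buckets are built over all documents.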