Nested filter in an Elasticsearch aggregation query

Time: 2020-01-08 09:32:06

Tags: elasticsearch elasticsearch-aggregation

I am running the following aggregation query with a nested filter:

GET <indexname>/_search
{
  "aggs": {
    "NAME": {
      "nested": {
        "path": "crm.LeadStatusHistory"
      },
      "aggs": {
        "agg_filter": {
          "filter": {
            "bool": {
              "must": [
                {
                  "nested": {
                    "path": "crm",
                    "query": {
                      "terms": {
                        "crm.City.keyword": [
                          "Rewa"
                        ]
                      }
                    }
                  }
                },
                {
                  "nested": {
                    "path": "crm",
                    "query": {
                      "terms": {
                        "crm.LeadID": [
                          27961
                        ]
                      }
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "agg_terms":{
              "terms": {
                "field": "crm.LeadStatusHistory.StatusID",
                "size": 1000
              }
            }
          }
        }
      }
    }
  }
}

I have the following document:

{
        "_index" : "crm",
        "_type" : "_doc",
        "_id" : "4478",
        "_score" : 1.0,
        "_source" : {
          "crm" : [
            {
              "LeadStatusHistory" : [
                {
                  "StatusID" : 3
                },
                {
                  "StatusID" : 2
                },
                {
                  "StatusID" : 1
                }
              ],
              "LeadID" : 27961,
              "City" : "Rewa"
            },
            {
              "LeadStatusHistory" : [
                {
                  "StatusID" : 1
                },
                {
                  "StatusID" : 3
                },
                {
                  "StatusID" : 2
                }
              ],
              "LeadID" : 27959,
              "City" : "Rewa"
            }
          ]
        }
      }

However, this is the result I get in response:

"aggregations" : {
    "NAME" : {
      "doc_count" : 4332,
      "agg_filter" : {
        "doc_count" : 1,
        "agg_terms" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : 1,
              "doc_count" : 1
            }
          ]
        }
      }
    }
  }

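Question: according to the source document, crm.LeadID = 27961 has 3 nested 'crm.LeadStatusHistory' documents. However, the result shows a doc_count of 1 for agg_filter instead of 3. Please let me know why this happens.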
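
To make the expected numbers concrete, here is a minimal Python sketch that counts the status entries directly in the `_source` shown above. It is only a local check on the sample document, not an Elasticsearch call, and the helper name `count_status_entries` is made up for illustration.

```python
# Sample _source copied from the document above (hit _id 4478).
source = {
    "crm": [
        {
            "LeadStatusHistory": [{"StatusID": 3}, {"StatusID": 2}, {"StatusID": 1}],
            "LeadID": 27961,
            "City": "Rewa",
        },
        {
            "LeadStatusHistory": [{"StatusID": 1}, {"StatusID": 3}, {"StatusID": 2}],
            "LeadID": 27959,
            "City": "Rewa",
        },
    ]
}


def count_status_entries(source, lead_id=None):
    """Count LeadStatusHistory entries, optionally for a single LeadID only."""
    return sum(
        len(lead["LeadStatusHistory"])
        for lead in source["crm"]
        if lead_id is None or lead["LeadID"] == lead_id
    )


# Entries under LeadID 27961: this is the number I expect agg_filter to report.
print(count_status_entries(source, lead_id=27961))  # prints 3
# Entries under every lead in the document.
print(count_status_entries(source))  # prints 6
```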

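2 answers:

Answer 0: (score: 0)

Your agg_filter sits under crm.LeadStatusHistory, so it only applies to 1 document (LeadStatusHistory is a document in its own right; in your case it links to other documents).

I built a query to show the problem, and then I will answer your question. You will see a different doc_count for each aggregation.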

{
  "size": 0,
  "aggs": {
    "NAME": {
      "nested": {
        "path": "crm"
      },
      "aggs": {
        "agg_LeadID": {
          "terms": {
            "field": "crm.LeadID"
          },
          "aggs": {
            "agg_LeadStatusHistory": {
              "nested": {
                "path": "crm.LeadStatusHistory"
              },
              "aggs": {
                "home_type_name": {
                  "terms": {
                    "field": "crm.LeadStatusHistory.StatusID"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
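
As a rough illustration of the doc_count at each level, the following Python sketch mimics the bucket tree of the query above for the single sample document from the question. It is a hand-written model, not Elasticsearch output; the function name `bucket_tree` and the keys `nested_doc_count` and `status_ids` are hypothetical, and on the real index the numbers will be larger.

```python
from collections import Counter

# Sample _source from the question (hit _id 4478).
source = {
    "crm": [
        {
            "LeadStatusHistory": [{"StatusID": 3}, {"StatusID": 2}, {"StatusID": 1}],
            "LeadID": 27961,
            "City": "Rewa",
        },
        {
            "LeadStatusHistory": [{"StatusID": 1}, {"StatusID": 3}, {"StatusID": 2}],
            "LeadID": 27959,
            "City": "Rewa",
        },
    ]
}


def bucket_tree(source):
    """Model the counts of: nested(crm) > terms(LeadID) > nested(history) > terms(StatusID)."""
    leads = source["crm"]
    lead_buckets = {}
    for lead in leads:
        history = lead["LeadStatusHistory"]
        bucket = lead_buckets.setdefault(
            lead["LeadID"],
            {"doc_count": 0, "nested_doc_count": 0, "status_ids": Counter()},
        )
        bucket["doc_count"] += 1  # one crm entry lands in this LeadID bucket
        bucket["nested_doc_count"] += len(history)  # its status entries
        bucket["status_ids"].update(entry["StatusID"] for entry in history)
    # Top level: one nested document per crm entry.
    return {"doc_count": len(leads), "agg_LeadID": lead_buckets}


tree = bucket_tree(source)
print("NAME doc_count:", tree["doc_count"])
for lead_id, bucket in sorted(tree["agg_LeadID"].items()):
    print(lead_id, bucket["doc_count"], bucket["nested_doc_count"], dict(bucket["status_ids"]))
```

In this model each LeadID bucket holds 1 crm entry and 3 status entries, which is the per-lead count the question is after.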

With the following query you can count them using a script (and filter if needed):

{
  "size": 0,
  "aggs": {
    "NAME": {
      "nested": {
        "path": "crm"
      },
      "aggs": {
        "agg_LeadID": {
          "terms": {
            "field": "crm.LeadID"
          },
          "aggs": {
            "agg_LeadStatusHistory": {
              "nested": {
                "path": "crm.LeadStatusHistory"
              },
              "aggs": {
                "agg_LeadStatusHistory_sum": {
                  "sum": {
                    "script": "doc['crm.LeadStatusHistory.StatusID'].size()"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
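
If you do need a filter, one option is to apply it while the aggregation is still at the crm level and only then step down into crm.LeadStatusHistory. The Python sketch below just builds such a request body and prints it as JSON. I have not run it against your index; it assumes that both crm and crm.LeadStatusHistory are mapped as nested and that crm.City has a keyword sub-field, as your original query suggests. The function name `build_filtered_status_agg` and the aggregation name `agg_LeadStatusHistory` are placeholders.

```python
import json


def build_filtered_status_agg(lead_id, city):
    """Build a search body: nested(crm) > filter > nested(history) > terms(StatusID)."""
    return {
        "size": 0,
        "aggs": {
            "NAME": {
                # Enter the crm level first, so crm.* fields can be filtered directly.
                "nested": {"path": "crm"},
                "aggs": {
                    "agg_filter": {
                        "filter": {
                            "bool": {
                                "filter": [
                                    {"term": {"crm.LeadID": lead_id}},
                                    {"term": {"crm.City.keyword": city}},
                                ]
                            }
                        },
                        "aggs": {
                            # Only now step down to the status history entries.
                            "agg_LeadStatusHistory": {
                                "nested": {"path": "crm.LeadStatusHistory"},
                                "aggs": {
                                    "agg_terms": {
                                        "terms": {
                                            "field": "crm.LeadStatusHistory.StatusID",
                                            "size": 1000,
                                        }
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


body = build_filtered_status_agg(27961, "Rewa")
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of GET <indexname>/_search. For the sample document in the question, agg_LeadStatusHistory should then report the 3 status entries of LeadID 27961.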

Note: if you want the number of nested documents, take a look at inner_hits: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-inner-hits
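
For the inner_hits route, a request body might look like the sketch below. It is only an illustration built as a Python dict and printed as JSON, untested against your index; it assumes crm is mapped as nested, and the function name `build_inner_hits_query` is hypothetical. With an empty inner_hits object, each hit should list the crm entries that matched the nested query.

```python
import json


def build_inner_hits_query(lead_id):
    """Build a search body that returns the matching crm entries via inner_hits."""
    return {
        # Skip the full root _source; the matching entries come back under inner_hits.
        "_source": False,
        "query": {
            "nested": {
                "path": "crm",
                "query": {"term": {"crm.LeadID": lead_id}},
                "inner_hits": {},
            }
        },
    }


body = build_inner_hits_query(27961)
print(json.dumps(body, indent=2))
```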

Answer 1: (score: 0)

The response does not correspond to a single document in 'crm.LeadStatusHistory'. I ran the aggregation query on crm.LeadStatusHistory without a filter:

GET crm/_search
{
  "_source": ["crm.LeadID","crm.LeadStatusHistory.StatusID","crm.City"], 
  "size": 10000,
  "query": {
    "nested": {
      "path": "crm",
      "query": {
        "match": {
          "crm.LeadID": "27961"
        }
      }
    }
  }, 
  "aggs": {
    "agg_statuscount": {
      "nested": {
        "path": "crm.LeadStatusHistory"
      },
      "aggs": {
        "agg_terms": {
          "terms": {
            "field": "crm.LeadStatusHistory.StatusID",
            "size": 1000
          }
        }
      }
    }
  }
}

I get the following response from the query above, which shows a doc_count of 6 for 'agg_statuscount' when no filter is applied:

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "crm",
        "_type" : "_doc",
        "_id" : "4478",
        "_score" : 1.0,
        "_source" : {
          "crm" : [
            {
              "LeadStatusHistory" : [
                {
                  "StatusID" : 3
                },
                {
                  "StatusID" : 2
                },
                {
                  "StatusID" : 1
                }
              ],
              "LeadID" : 27961,
              "City" : "Rewa"
            },
            {
              "LeadStatusHistory" : [
                {
                  "StatusID" : 1
                },
                {
                  "StatusID" : 3
                },
                {
                  "StatusID" : 2
                }
              ],
              "LeadID" : 27959,
              "City" : "Rewa"
            }
          ]
        }
      }
    ]
  },
  "aggregations" : {
    "agg_statuscount" : {
      "doc_count" : 6,
      "agg_terms" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : 1,
            "doc_count" : 2
          },
          {
            "key" : 2,
            "doc_count" : 2
          },
          {
            "key" : 3,
            "doc_count" : 2
          }
        ]
      }
    }
  }
}

So with crm.LeadID = 27961 in the aggregation filter, I expect 3 'crm.LeadStatusHistory' documents. At the moment, my original query returns 1.