Getting the count of a specific field across documents with Elasticsearch

Time: 2017-02-08 07:41:28

Tags: elasticsearch elasticsearch-aggregation

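Requirement: I want to find the number of aIds for a specific category ID. (For example, for categoryId 2532 I expect a count of 2, meaning it is assigned to two aIds.)

I tried using aggregations, but I could only get the document count, not the field count.

Mapping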

"List": {
  "properties": {
    "aId": {
      "type": "long"
    },
    "CategoryList": {
      "properties": {
        "categoryId": {
          "type": "long"
        },
        "categoryName": {
          "type": "string"
        }
      }
    }
  }
}

Sample document:

"List": [
  {
    "aId": 33074,
    "CategoryList": [
      {
        "categoryId": 2532,
        "categoryName": "VODAFONE"
      }
    ]
  },
  {
    "aId": 12074,
    "CategoryList": [
      {
        "categoryId": 2532,
        "categoryName": "VODAFONE"
      }
    ]
  },
  {
    "aId": 120755,
    "CategoryList": [
      {
        "categoryId": 1234,
        "categoryName": "SMPLKE"
      }
    ]
  }
]

1 answer:

Answer 0 (score: -1)

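A cardinality aggregation will not give you the result you want. It returns the number of distinct values of a field, whereas you want the number of times a given value occurs, that is, how many documents contain it.

You can use the following query instead: it first filters the documents on CategoryList.categoryId and then runs a simple terms aggregation on the same field.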

POST index_name1111/_search
{
    "size": 0,
    "query": {
        "bool": {
            "must": [{
                "term": {
                    "CategoryList.categoryId": {
                        "value": 2532
                    }
                }
            }]
        }
    },
    "aggs": {
        "count_is": {
            "terms": {
                "field": "CategoryList.categoryId",
                "size": 10
            }
        }
    }
}

Response to the above query:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "count_is": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 2532,
          "doc_count": 2
        }
      ]
    }
  }
}
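
To make the meaning of doc_count concrete, here is a small Python sketch that mimics the filter plus terms aggregation on the three sample documents. It does not talk to Elasticsearch; the helper names (`category_ids`, `terms_doc_counts`) are hypothetical, and it assumes each entry of the sample list is indexed as its own document.

```python
from collections import Counter

# Sample documents from the question, assumed to be indexed one per entry.
docs = [
    {"aId": 33074, "CategoryList": [{"categoryId": 2532, "categoryName": "VODAFONE"}]},
    {"aId": 12074, "CategoryList": [{"categoryId": 2532, "categoryName": "VODAFONE"}]},
    {"aId": 120755, "CategoryList": [{"categoryId": 1234, "categoryName": "SMPLKE"}]},
]


def category_ids(doc):
    """Return the distinct categoryId values found in one document."""
    return {category["categoryId"] for category in doc["CategoryList"]}


def terms_doc_counts(documents, only_category=None):
    """Mimic a terms aggregation: one bucket per categoryId, counting documents."""
    counts = Counter()
    for doc in documents:
        ids = category_ids(doc)
        if only_category is not None and only_category not in ids:
            continue  # plays the role of the term query filter
        counts.update(ids)  # each document adds at most 1 to each bucket
    return dict(counts)


filtered = terms_doc_counts(docs, only_category=2532)
print(filtered)
```

Because every sample document carries a single aId, the document count for categoryId 2532 is also the number of aIds assigned to it.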

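Alternatively, you can drop the filter and run only the aggregation, which returns every categoryId together with the number of documents it appears in.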

POST index_name1111/_search
{
  "size": 0,
  "aggs": {
    "count_is": {
      "terms": {
        "field": "CategoryList.categoryId",
        "size": 10
      }
    }
  }
}

Response to the above query:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "count_is": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 2532,
          "doc_count": 2
        },
        {
          "key": 1234,
          "doc_count": 1
        }
      ]
    }
  }
}
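
The two request bodies above differ only in the optional term filter, so they can be generated from one place. The following is a rough sketch using only the Python standard library; `build_count_query` is a hypothetical helper, not part of any Elasticsearch client, and it only builds the JSON body without sending it.

```python
import json


def build_count_query(category_id=None, field="CategoryList.categoryId", size=10):
    """Build the search body: a terms aggregation, optionally filtered by one category."""
    body = {
        "size": 0,  # return aggregation results only, no hits
        "aggs": {"count_is": {"terms": {"field": field, "size": size}}},
    }
    if category_id is not None:
        # Same term filter as in the first query of this answer.
        body["query"] = {"bool": {"must": [{"term": {field: {"value": category_id}}}]}}
    return body


# Body for the filtered variant; omit the argument to count every categoryId.
print(json.dumps(build_count_query(2532), indent=2))
```

The printed JSON can be sent as the body of POST index_name1111/_search with whatever HTTP client you already use.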

With a cardinality aggregation, the following query returns the response shown after it.

POST index_name1111/_search
{
    "size": 0,
    "query": {
        "bool": {
            "must": [{
                "term": {
                    "CategoryList.categoryId": {
                        "value": 2532
                    }
                }
            }]
        }
    },
    "aggs": {
        "id_count": {
            "cardinality": {
                "field": "CategoryList.categoryId"
            }
        }
    }
}

The response to the above query does not give you the result you want: both matching documents have categoryId 2532, so the distinct count is 1.

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "id_count": {
      "value": 1
    }
  }
}
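
The difference between the two numbers in this response can be reproduced in a few lines of Python. This is only an illustration on the sample data, with a hypothetical helper name; the real cardinality aggregation is an approximate count on large data sets, while this sketch counts exactly.

```python
# Sample documents from the question, assumed to be indexed one per entry.
docs = [
    {"aId": 33074, "CategoryList": [{"categoryId": 2532, "categoryName": "VODAFONE"}]},
    {"aId": 12074, "CategoryList": [{"categoryId": 2532, "categoryName": "VODAFONE"}]},
    {"aId": 120755, "CategoryList": [{"categoryId": 1234, "categoryName": "SMPLKE"}]},
]


def hits_and_cardinality(documents, category_id):
    """Return (matching document count, distinct categoryId count among the matches)."""
    matching = [
        doc for doc in documents
        if any(c["categoryId"] == category_id for c in doc["CategoryList"])
    ]
    distinct_ids = {c["categoryId"] for doc in matching for c in doc["CategoryList"]}
    return len(matching), len(distinct_ids)


hits_total, distinct_count = hits_and_cardinality(docs, 2532)
print(hits_total, distinct_count)
```

Here `hits_total` corresponds to hits.total (2) and `distinct_count` to the id_count value (1): two documents match, but they share a single distinct categoryId.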

Hope this helps. Thanks.