In Elasticsearch, how to group by a value in a nested array

Time: 2016-01-15 21:54:34

Tags: elasticsearch

Say I have the following documents:

Document 1:

{
  "productName": "product1",
  tags: [
    {
      "name":"key1",
      "value":"value1"
    },
    {
      "name":"key2",
      "value":"value2"
    }
  ]
}

Document 2:

{
  "productName": "product2",
  tags: [
    {
      "name":"key1",
      "value":"value1"
    },
    {
      "name":"key2",
      "value":"value3"
    }
  ]
}

I know that if I want to group by productName, I can use a terms aggregation:

"terms": {
    "field": "productName"
}

This gives me two buckets with two different keys, "product1" and "product2".
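As a rough illustration of what that terms aggregation computes, here is a minimal Python sketch that counts documents per distinct field value. It does not talk to Elasticsearch; the names `docs` and `terms_buckets` are made up for this example, and the ordering (count descending, then key ascending) is an assumption that mirrors the default bucket order.

```python
from collections import Counter

# The two example documents, held as plain Python dicts (illustrative only).
docs = [
    {
        "productName": "product1",
        "tags": [
            {"name": "key1", "value": "value1"},
            {"name": "key2", "value": "value2"},
        ],
    },
    {
        "productName": "product2",
        "tags": [
            {"name": "key1", "value": "value1"},
            {"name": "key2", "value": "value3"},
        ],
    },
]


def terms_buckets(documents, field):
    """Count documents per distinct value of a top-level field."""
    counts = Counter(doc[field] for doc in documents if field in doc)
    # Order by document count (descending), then by key (ascending).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]


buckets = terms_buckets(docs, "productName")
print(buckets)
```

Each distinct productName becomes one bucket, which is why the two documents produce two buckets.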

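However, what should the query look like if I want to group by tag key? That is, if I group by the tag with name == key1, I expect a single bucket with key = "value1"; if I group by the tag with name == key2, I expect two buckets with keys "value2" and "value3".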

What should the query look like if I want to group by the 'value' inside the nested array rather than by the 'key'? Any suggestions?

1 Answer:

Answer 0 (score: 5)

It sounds like a nested terms aggregation is what you are looking for.

With the two documents you posted, this query:

POST /test_index/_search
{
   "size": 0,
   "aggs": {
      "product_name_terms": {
         "terms": {
            "field": "product_name"
         }
      },
      "nested_tags": {
         "nested": {
            "path": "tags"
         },
         "aggs": {
            "tags_name_terms": {
               "terms": {
                  "field": "tags.name"
               }
            },
            "tags_value_terms": {
               "terms": {
                  "field": "tags.value"
               }
            }
         }
      }
   }
}

returns:

{
   "took": 67,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "product_name_terms": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": []
      },
      "nested_tags": {
         "doc_count": 4,
         "tags_name_terms": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
               {
                  "key": "key1",
                  "doc_count": 2
               },
               {
                  "key": "key2",
                  "doc_count": 2
               }
            ]
         },
         "tags_value_terms": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
               {
                  "key": "value1",
                  "doc_count": 2
               },
               {
                  "key": "value2",
                  "doc_count": 1
               },
               {
                  "key": "value3",
                  "doc_count": 1
               }
            ]
         }
      }
   }
}

Here is some code I used to test it:

http://sense.qbox.io/gist/a9a172f41dbd520d5e61063a9686055681110522

Edit: Filtering by nested value

Based on your comment, if you want to filter the nested results by a value (of the nested objects), you can add another "layer" of aggregation using a filter aggregation, like so:

POST /test_index/_search
{
   "size": 0,
   "aggs": {
      "nested_tags": {
         "nested": {
            "path": "tags"
         },
         "aggs": {
            "filter_tag_name": {
               "filter": {
                  "term": {
                     "tags.name": "key1"
                  }
               },
               "aggs": {
                  "tags_name_terms": {
                     "terms": {
                        "field": "tags.name"
                     }
                  },
                  "tags_value_terms": {
                     "terms": {
                        "field": "tags.value"
                     }
                  }
               }
            }
         }
      }
   }
}

Here is the updated code:

http://sense.qbox.io/gist/507c3aabf36b8f6ed8bb076c8c1b8552097c5458
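To make the logic of the nested, filter and terms layers easier to follow, here is a small Python sketch that mimics them on the two sample documents in memory. It is only an approximation and does not call Elasticsearch; `docs`, `nested_tag_buckets`, `group_by` and `filter_name` are names invented for this illustration. In a real index the `tags` field also has to be mapped with the nested type for the nested aggregation to work.

```python
from collections import Counter

# The two sample documents from the question, as plain dicts (illustrative only).
docs = [
    {
        "productName": "product1",
        "tags": [
            {"name": "key1", "value": "value1"},
            {"name": "key2", "value": "value2"},
        ],
    },
    {
        "productName": "product2",
        "tags": [
            {"name": "key1", "value": "value1"},
            {"name": "key2", "value": "value3"},
        ],
    },
]


def nested_tag_buckets(documents, group_by, filter_name=None):
    """Mimic nested -> optional filter on tags.name -> terms on tags.<group_by>."""
    # Nested step: look at every tag object on its own, not at parent documents.
    tags = [tag for doc in documents for tag in doc.get("tags", [])]
    # Filter step: keep only the tag objects whose name matches, if requested.
    if filter_name is not None:
        tags = [tag for tag in tags if tag["name"] == filter_name]
    # Terms step: one bucket per distinct value, counting tag objects.
    counts = Counter(tag[group_by] for tag in tags)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]


print(nested_tag_buckets(docs, "value"))
print(nested_tag_buckets(docs, "value", filter_name="key1"))
print(nested_tag_buckets(docs, "value", filter_name="key2"))
```

With these two documents, filtering on key1 leaves a single bucket for value1, and filtering on key2 leaves one bucket each for value2 and value3, which matches what the question asks for. Note that the counts refer to nested tag objects rather than to parent documents.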