如何聚合elasticsearch中的动态字段?

时间:2014-07-17 10:04:51

标签: elasticsearch

我试图通过elasticsearch聚合动态字段(不同文档的不同)。文件如下:

[{
   "name": "galaxy note",
   "price": 123,
   "attributes": {
      "type": "phone",
      "weight": "140gm"
   }
},{
   "name": "shirt",
   "price": 123,
   "attributes": {
      "type": "clothing",
      "size": "m"
   }
}]

正如您所看到的,文档中的属性发生了变化。我想要实现的是聚合这些属性的字段,如下所示:

{
     aggregations: {
         types: {
             buckets: [{key: 'phone', count: 123}, {key: 'clothing', count: 12}]
         }
     }
}

我正在尝试弹性搜索的aggregation功能来实现这一目标,但无法找到正确的方法。是否有可能通过聚合实现?或者我应该开始寻找facets,认为它似乎被剥夺了。

2 个答案:

答案 0 :(得分:14)

您必须将属性定义为嵌套在映射中,并将单个属性值的布局更改为固定布局{ key: DynamicKey, value: DynamicValue }

PUT /catalog
{
  "settings" : {
    "number_of_shards" : 1
  },
  "mappings" : {
    "article": {
      "properties": {
        "name": { 
          "type" : "string", 
          "index" : "not_analyzed" 
        },
        "price": { 
          "type" : "integer" 
        },
        "attributes": {
          "type": "nested",
          "properties": {
            "key": {
              "type": "string"
            },
            "value": {
              "type": "string"
            }
          }
        }
      }  
    }
  }
}

您可以像这样索引您的文章

POST /catalog/article
{
  "name": "shirt",
  "price": 123,
  "attributes": [
    { "key": "type", "value": "clothing"},
    { "key": "size", "value": "m"}
  ]
}

POST /catalog/article
{
  "name": "galaxy note",
  "price": 123,
  "attributes": [
    { "key": "type", "value": "phone"},
    { "key": "weight", "value": "140gm"}
  ]
}

毕竟,您可以聚合嵌套属性

GET /catalog/_search
{
  "query":{
    "match_all":{}
  },     
  "aggs": {
    "attributes": {
      "nested": {
        "path": "attributes"
      },
      "aggs": {
        "key": {
          "terms": {
            "field": "attributes.key"
          },
          "aggs": {
            "value": {
              "terms": {
                "field": "attributes.value"
              }
            }
          }
        }
      }
    }
  }
}

然后以稍微不同的形式向您提供您所要求的信息

[...]
"buckets": [
  {
    "key": "type",
    "doc_count": 2,
    "value": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
      {
        "key": "clothing",
        "doc_count": 1
      }, {
        "key": "phone",
        "doc_count": 1
      }
      ]
    }
  },
[...]

答案 1 :(得分:0)

不确定这是不是您的意思,但这对于基本的聚合功能来说相当简单。请注意,我没有包含映射,所以使用多个单词的类型会得到双重结果。

POST /product/atype
{
   "name": "galaxy note",
   "price": 123,
   "attributes": {
      "type": "phone",
      "weight": "140gm"
   }
}

POST /product/atype
{
   "name": "shirt",
   "price": 123,
   "attributes": {
      "type": "clothing",
      "size": "m"
   }
}

GET /product/_search?search_type=count
{
  "aggs": {
    "byType": {
      "terms": {
        "field": "attributes.type",
        "size": 10
      }
    }
  }
}