Elasticsearch aggregation: grouping values

Time: 2015-05-05 13:52:20

Tags: elasticsearch

My documents are structured like this:

{
"title" : "A title",
"ExtraFields": [
    {
        "value": "print",
        "fieldID": "5535627631efa0843554b0ea"
    }
    ,
    {
        "value": "POLYE",
        "fieldID": "5535627631efa0843554b0ec"
    }
    ,
    {
        "value": "30",
        "fieldID": "5535627631efa0843554b0ed"
    }
    ,
    {
        "value": "0",
        "fieldID": "5535627631efa0843554b0ee"
    }
    ,
    {
        "value": "0",
        "fieldID": "5535627731efa0843554b0ef"
    }
    ,
    {
        "value": "0.42",
        "fieldID": "5535627831efa0843554b0f0"
    }
    ,
    {
        "value": "40",
        "fieldID": "5535627831efa0843554b0f1"
    }
    ,
    {
        "value": "30",
        "fieldID": "5535627831efa0843554b0f2"
    }
    ,
    {
        "value": "18",
        "fieldID": "5535627831efa0843554b0f3"
    }
    ,
    {
        "value": "24",
        "fieldID": "5535627831efa0843554b0f4"
    }
]
}

The ideal output (best case) would be:

[
{
    "field" : "5535627831efa0843554b0f4",
    "values" : [
        {
            "label" : "24",
            "count" : 2
        },
        {
            "label" : "18",
            "count" : 5
        }
    ]
},
{
    "field" : "5535627831efa0843554b0f3",
    "values" : [
        {
            "label" : "cott",
            "count" : 20
        },
        {
            "label" : "polye",
            "count" : 12
        }
    ]
}
]

But I could also live with a simpler one (this is what I currently get from MongoDB):

[
{
    "field" : "5535627831efa0843554b0f4",
    "value" : "24",
    "count" : 2
},
{
    "field" : "5535627831efa0843554b0f4",
    "value" : "18",
    "count" : 5
},
{
    "field" : "5535627831efa0843554b0f3",
    "value" : "cott",
    "count" : 20
},
{
    "field" : "5535627831efa0843554b0f3",
    "value" : "polye",
    "count" : 12
}
] 

What would the aggregation query look like? Does this structure need any special mapping?

1 answer:

Answer 0: (score: 1)

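To get what you want, you need to map the ExtraFields substructure as nested. Your document mapping would look like this (doctype is the name I picked for your document type, but it can be whatever you have now):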

PUT /test/_mapping/doctype
{
  "doctype": {
    "properties": {
      "title": {
        "type": "string"
      },
      "ExtraFields": {
        "type": "nested",
        "properties": {
          "value": {
            "type": "string",
            "index": "not_analyzed"
          },
          "fieldID": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
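
As a rough illustration of why the nested type matters here: with a plain object mapping, Elasticsearch flattens an array of objects into one multi-valued list per field, so the pairing between a fieldID and its value is lost. The Python sketch below only mimics that flattening on two entries from the question; the function name flatten_object_array is made up for this example and is not an Elasticsearch API.

```python
def flatten_object_array(items, prefix="ExtraFields"):
    """Mimic how an array of plain objects is indexed:
    one multi-valued list per field, with no link between siblings."""
    flat = {}
    for item in items:
        for key, val in item.items():
            flat.setdefault(f"{prefix}.{key}", []).append(val)
    return flat


# Two entries taken from the document in the question.
extra_fields = [
    {"value": "print", "fieldID": "5535627631efa0843554b0ea"},
    {"value": "POLYE", "fieldID": "5535627631efa0843554b0ec"},
]

flat = flatten_object_array(extra_fields)
# flat["ExtraFields.value"] and flat["ExtraFields.fieldID"] are now two
# independent lists: nothing records that "print" belonged to ...b0ea.
print(flat)
```

With the nested mapping, each array entry is indexed as its own hidden document, so a terms aggregation on fieldID with a sub-aggregation on value counts only the values that actually sit next to that fieldID.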

Then you can index a document:

PUT /test/doctype/123
{
    "title" : "A title",
    "ExtraFields": [
       ...
    ]
}

And send the following aggregation query:

POST /test/doctype/_search
{
  "size": 0,
  "aggs": {
    "fields": {
      "nested": {
        "path": "ExtraFields"
      },
      "aggs": {
        "fields": {
          "terms": {
            "field": "ExtraFields.fieldID"
          },
          "aggs": {
            "values": {
              "terms": {
                "field": "ExtraFields.value"
              }
            }
          }
        }
      }
    }
  }
}

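This will yield the result you described as the best case. The JSON field names in the response are a bit different (buckets with key and doc_count instead of label and count), but I guess that's fine.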
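
If the exact field names matter, the response can be reshaped on the client. The following is a minimal Python sketch, assuming the response follows the usual layout for the query above (aggregations.fields.fields.buckets, each bucket carrying key, doc_count and a values sub-aggregation). The sample_response is hand-written for illustration using the counts from the question, not real output, and the function names are made up for this example.

```python
def group_by_field(response):
    """Reshape the nested/terms aggregation response into the 'best case' form."""
    field_buckets = response["aggregations"]["fields"]["fields"]["buckets"]
    return [
        {
            "field": fb["key"],
            "values": [
                {"label": vb["key"], "count": vb["doc_count"]}
                for vb in fb["values"]["buckets"]
            ],
        }
        for fb in field_buckets
    ]


def flatten_counts(response):
    """Reshape the same response into the simpler one-row-per-value form."""
    return [
        {"field": group["field"], "value": v["label"], "count": v["count"]}
        for group in group_by_field(response)
        for v in group["values"]
    ]


# Hand-written sample (assumed shape, illustrative numbers only).
sample_response = {
    "aggregations": {
        "fields": {
            "doc_count": 7,
            "fields": {
                "buckets": [
                    {
                        "key": "5535627831efa0843554b0f4",
                        "doc_count": 7,
                        "values": {
                            "buckets": [
                                {"key": "18", "doc_count": 5},
                                {"key": "24", "doc_count": 2},
                            ]
                        },
                    }
                ]
            },
        }
    }
}

grouped = group_by_field(sample_response)
rows = flatten_counts(sample_response)
```

Keep in mind that a terms aggregation returns only the top buckets by default (10), so if you have more distinct fieldIDs or values than that, you would likely want to set "size" on each terms aggregation.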

Give it a try and let us know.