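Aggregating nested JSON data in Elasticsearch

Time: 2019-07-09 15:05:24

Tags: elasticsearch elasticsearch-aggregation elasticsearch-dsl-py

I need to run some aggregations over JSON data. I have read several answers on Stack Overflow, but none of them helped. I have multiple rows, and in each row the timeCountry column holds an array of JSON objects with the keys count, country_name, and s_name.

I need the sum of count across all rows, grouped by s_name. For example, suppose timeCountry in the first row holds the following array: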

[ {
      "count": 12,
      "country_name": "america",
      "s_name": "us"
    },
    {
      "count": 10,
      "country_name": "new zealand",
      "s_name": "nz"
    },
    {
      "count": 20,
      "country_name": "India",
      "s_name": "Ind"
    }]

and the second row holds:

[{
  "count": 12,
  "country_name": "america",
  "s_name": "us"
  },
  {
  "count": 10,
  "country_name": "South Africa",
  "s_name": "sa"
  },
  {
  "count": 20,
  "country_name": "india",
  "s_name": "ind"
  }]

and so on.

I need a result like this:

[{
        "count": 24,
        "country_name": "america",
        "s_name": "us"
    }, {
        "count": 10,
        "country_name": "new zealand",
        "s_name": "nz"
    },
    {
        "count": 40,
        "country_name": "India",
        "s_name": "Ind"
    }, {
        "count": 10,
        "country_name": "South Africa",
        "s_name": "sa"
    }
]
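
To make the expected result concrete, here is a minimal client-side sketch in plain Python that reproduces it from the two sample rows. It is only a reference for checking what the Elasticsearch aggregation should return, not a replacement for it. The function name `sum_counts_by_s_name` is hypothetical, and the sketch assumes that s_name should be compared case-insensitively, since the expected result merges "Ind" and "ind" into a single entry with count 40.

```python
def sum_counts_by_s_name(rows):
    """Sum `count` per `s_name` across all rows.

    Assumption: s_name is compared case-insensitively, and the first
    spelling seen for country_name and s_name is the one reported.
    """
    totals = {}
    for row in rows:
        for entry in row:
            key = entry["s_name"].lower()
            if key not in totals:
                totals[key] = {
                    "count": 0,
                    "country_name": entry["country_name"],
                    "s_name": entry["s_name"],
                }
            totals[key]["count"] += entry["count"]
    # dicts keep insertion order, so entries appear in first-seen order
    return list(totals.values())


row1 = [
    {"count": 12, "country_name": "america", "s_name": "us"},
    {"count": 10, "country_name": "new zealand", "s_name": "nz"},
    {"count": 20, "country_name": "India", "s_name": "Ind"},
]
row2 = [
    {"count": 12, "country_name": "america", "s_name": "us"},
    {"count": 10, "country_name": "South Africa", "s_name": "sa"},
    {"count": 20, "country_name": "india", "s_name": "ind"},
]

result = sum_counts_by_s_name([row1, row2])
for item in result:
    print(item)
```

Without the `.lower()` call, "Ind" and "ind" would be counted as two separate groups of 20 each, which is also how a terms aggregation on an unnormalized keyword field would treat them.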

Each array above is the data for a single row; I have many rows, and timeCountry is the column.

This is the aggregation I tried:

{
   "query": {
      "match_all": {}
   },
   "aggregations":{
        "records" :{
            "nested":{
                "path":"timeCountry"
            },
            "aggregations":{
                "ids":{
                    "terms":{
                        "field": "timeCountry.country_name"
                    }
                }
            }
        }
   }

}

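But it does not work. Please help.

1 answer:

Answer 0: (score: 1)

I tried this on my local Elasticsearch cluster and was able to get aggregated data from the nested documents. Depending on your index mapping, your results may differ from mine. Here is the DSL I used for the aggregation: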

{
    "aggs" : {
        "records" : {
            "nested" : {
                "path" : "timeCountry"
            },
            "aggs" : {
                "ids" : {
                    "terms" : {
                        "field" : "timeCountry.country_name.keyword"
                    },
                    "aggs" : {
                        "sum_name" : {
                            "sum" : { "field" : "timeCountry.count" }
                        }
                    }
                }
            }
        }
    }
}
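
Since the question is tagged for the Python client and asks for grouping by s_name rather than country_name, here is a hedged sketch of the same nested terms-plus-sum aggregation built as a Python dict, together with a helper that flattens the response. The function names `build_body` and `flatten`, the `size` value, and the field `timeCountry.s_name.keyword` are assumptions: the `.keyword` sub-field exists only if s_name was dynamically mapped as text, and `sample_response` is hand-written to show the response shape, not real cluster output.

```python
def build_body(group_field="timeCountry.s_name.keyword", size=100):
    """Build the search body: nested -> terms on group_field -> sum of count."""
    return {
        "size": 0,  # return only aggregation results, no hits
        "aggs": {
            "records": {
                "nested": {"path": "timeCountry"},
                "aggs": {
                    "ids": {
                        "terms": {"field": group_field, "size": size},
                        "aggs": {
                            "sum_name": {"sum": {"field": "timeCountry.count"}}
                        },
                    }
                },
            }
        },
    }


def flatten(response):
    """Turn the bucket list into [{'s_name': ..., 'count': ...}, ...]."""
    buckets = response["aggregations"]["records"]["ids"]["buckets"]
    return [{"s_name": b["key"], "count": b["sum_name"]["value"]} for b in buckets]


# Hand-written, trimmed example of the response shape (not real output).
sample_response = {
    "aggregations": {
        "records": {
            "doc_count": 6,
            "ids": {
                "buckets": [
                    {"key": "us", "doc_count": 2, "sum_name": {"value": 24.0}},
                    {"key": "Ind", "doc_count": 1, "sum_name": {"value": 20.0}},
                ]
            },
        }
    }
}

body = build_body()
flat = flatten(sample_response)
print(flat)
```

With the official Python client, `body` would typically be passed to `Elasticsearch.search` along with your index name. Note that a terms aggregation on a keyword field is case-sensitive, so "Ind" and "ind" land in separate buckets unless the values are normalized at index time.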

Here is the mapping of my index:

{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "agg_data" : {
            "properties" : {
                "timeCountry" : {
                    "type" : "nested"
                }
            }
        }
    }
}
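
Because the answer notes that results depend on the index mapping, here is a hedged variant of that mapping, written as a Python dict, that declares the sub-fields of timeCountry explicitly. The function name `build_index_body` and the choice of `integer` and `keyword` types are assumptions, not part of the original answer. The `agg_data` mapping type is kept from the answer; newer Elasticsearch versions no longer accept a mapping type name, so that level would have to be dropped there.

```python
def build_index_body(doc_type="agg_data"):
    """Index settings plus a nested mapping with explicit sub-field types.

    Assumption: count is an integer, and country_name and s_name are
    exact-match strings, so they are mapped as keyword.
    """
    return {
        "settings": {"number_of_shards": 1},
        "mappings": {
            doc_type: {
                "properties": {
                    "timeCountry": {
                        "type": "nested",
                        "properties": {
                            "count": {"type": "integer"},
                            "country_name": {"type": "keyword"},
                            "s_name": {"type": "keyword"},
                        },
                    }
                }
            }
        },
    }


index_body = build_index_body()
time_country = index_body["mappings"]["agg_data"]["properties"]["timeCountry"]
print(time_country["properties"]["s_name"])
```

With s_name mapped directly as keyword, the terms aggregation would target `timeCountry.s_name` without the `.keyword` suffix.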