Elasticsearch: how to do a sub-aggregation inside a sum aggregation

Time: 2017-11-08 09:43:56

Tags: python elasticsearch aggregation

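Hello, I have an index in Elasticsearch with the fields Plant, Department, Date, Value, and I am trying to run the following queries against it.

1) Group by plant and date, excluding certain departments, and sum the values: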

es = Elasticsearch('elasticsearch:9200')
body = Dict({"query": { 
                "bool": {
                    "must_not": {
                        "match": {
                            "Department": "Indirect*"}}}},
             "aggs": {
                "group_code": {
                    "terms": {
                        "field": "Plant.keyword", "size":10000},
                     "aggs": {
                        "group_date": {
                            "terms": {
                                "field": "Date"},
                             "aggs": {
                                "group_value": {
                                    "sum":{
                                       "field": "Value"}}}}}}}})

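For reference, the response to a nested terms/terms/sum request like the one above comes back as nested bucket lists under `aggregations`. Below is a minimal sketch of flattening it into rows, assuming the aggregation names used above. `flatten_sum_response` is a hypothetical helper and `sample` is hand-written illustration data, not real output from the index.

```python
def flatten_sum_response(res):
    """Turn the nested plant -> date -> sum buckets into flat rows."""
    rows = []
    for plant in res["aggregations"]["group_code"]["buckets"]:
        for day in plant["group_date"]["buckets"]:
            rows.append({
                "Plant": plant["key"],
                # Date buckets usually carry a formatted key next to the epoch millis.
                "Date": day.get("key_as_string", day["key"]),
                "Value": day["group_value"]["value"],
            })
    return rows


# Hand-written sample in the shape returned for bucket aggregations (made-up values).
sample = {
    "aggregations": {
        "group_code": {
            "buckets": [
                {
                    "key": "PlantA",
                    "doc_count": 3,
                    "group_date": {
                        "buckets": [
                            {"key": 1509494400000,
                             "key_as_string": "2017-11-01T00:00:00.000Z",
                             "doc_count": 2,
                             "group_value": {"value": 30.0}},
                            {"key": 1509580800000,
                             "key_as_string": "2017-11-02T00:00:00.000Z",
                             "doc_count": 1,
                             "group_value": {"value": 5.0}},
                        ]
                    },
                }
            ]
        }
    }
}

rows = flatten_sum_response(sample)
```

Note that the inner `terms` aggregation on `Date` sets no `size`, so by default only the top 10 dates per plant are returned.
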
2) Group by plant and date range, and get the mean and median:

es = Elasticsearch('elasticsearch:9200')
body = Dict(
    {"query": {
            "bool": {
                "must_not": {
                    "match": {
                        "Department_Substrate": "Indirect*"}}}},
     "aggs": {
         "group_code": {
             "terms": {
                 "field": "Plant.keyword",
                 "size": 10000},
             "aggs": {
                 "group_date": {
                     "range": {
                         "field": "Date",
                         "ranges": datelist},
                     "aggs": {
                          "Median": {
                              "percentiles": {
                                  "field": "Value",
                                  "percents": [25]}},
                          "Mean": {
                               "avg": {
                                  "field": "Value"}}}}}}}})

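The question does not show how `datelist` is built. Here is a minimal sketch, assuming weekly buckets to roughly mirror the `freq='W'` grouping in the pandas code, and assuming ISO date strings in `from`/`to` as a `date_range` aggregation accepts them. Depending on the Elasticsearch version, a plain `range` aggregation on a date field may expect epoch milliseconds instead. `weekly_ranges` is a hypothetical helper and the dates are example values.

```python
from datetime import date, timedelta


def weekly_ranges(start, end):
    """Build consecutive 7-day ranges covering start (inclusive) to end (exclusive)."""
    ranges = []
    current = start
    while current < end:
        nxt = min(current + timedelta(days=7), end)
        # In range aggregations "from" is inclusive and "to" is exclusive.
        ranges.append({"from": current.isoformat(), "to": nxt.isoformat()})
        current = nxt
    return ranges


# Example values only; the real bounds depend on the data in the index.
datelist = weekly_ranges(date(2017, 10, 2), date(2017, 10, 23))
```

These windows start at whatever `start` date is passed in, so they match the pandas weekly bins only if `start` is aligned to the same weekday.
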
This also works, but in this case I am not grouping by plant and date first. Combining the two, I have something like this:

body = Dict({"query": {
                "bool": {
                    "must_not": {
                        "match": {
                            "Department_Substrate": "Indirect*"}}}},
             "aggs": {
                "group_code": {
                    "terms": {
                        "field": "Plant.keyword", "size":10000},
                     "aggs": {
                        "group_date": {
                            "terms": {
                                "field": "Date"},
                             "aggs": {
                                "group_value": {
                                    "sum":{
                                       "field": "Value"},
                                    "aggs": {
                                        "group_date": {
                                            "range": {
                                                "field": "Date",
                                                "ranges": datelist},
                                            "aggs": {
                                                 "Median": {
                                                     "percentiles": {
                                                         "field": "Value",
                                                         "percents": [25]}},
                                                 "Mean": {
                                                      "avg": {
                                                         "field":
                                                         "Value"}}}}}}}}}}}})
res = es.search(index=self.index, doc_type='test', body=body)

I get this error:

TransportError: TransportError(500, 'aggregation_initialization_exception', 'Aggregator [group_value] of type [sum] cannot accept sub-aggregations')

So is there a way to do this?

In case it helps, my previous Python code was:

data = test[~test.Department.str.startswith('Indirect')]
group1 = data.groupby(['Plant', 'Date'])['Value'].sum()
group2 = pd.DataFrame(group1.reset_index()).groupby(['Plant', pd.Grouper(key='Date', freq='W')])['Value'].median()

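1 answer:

Answer 0: (score: 0)

The error is clear: "Aggregator [group_value] of type [sum] cannot accept sub-aggregations". A `sum` is a metric aggregation, so once you compute it you cannot split the result any further. You should therefore move the sum aggregation to a different position, for example: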

{
"query": {
    "bool": {
        "must_not": {
            "match": {
                "Department_Substrate": "Indirect*"
            }
        }
    }
},
"aggs": {
    "group_code": {
        "terms": {
            "field": "Plant.keyword",
            "size": 10000
        },
        "aggs": {
            "group_date": {
                "terms": {
                    "field": "Date"
                },
                "aggs": {
                    "group_date": {
                        "range": {
                            "field": "Date",
                            "ranges": "sdf"
                        },
                        "aggs": {
                            "Median": {
                                "percentiles": {
                                    "field": "Value",
                                    "percents": [
                                        25
                                    ]
                                }
                            },
                            "group_value": {
                                "sum": {
                                    "field": "Value"
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
}
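
The rule behind the error can be checked locally before sending a request: only bucket aggregations (`terms`, `range`, ...) may carry `aggs`, while metric aggregations such as `sum` must be leaves. The sketch below is an illustration, not an Elasticsearch API. `metrics_with_sub_aggs` and `build_aggs` are hypothetical helpers, `METRIC_TYPES` is a partial list, the inner range aggregation is renamed `group_range` for readability, and `percents` is set to 50 on the assumption that a true median is wanted (the original `[25]` returns the first quartile).

```python
# Partial, illustrative list of metric aggregation types (leaves only).
METRIC_TYPES = {"sum", "avg", "min", "max", "percentiles",
                "value_count", "cardinality", "stats"}
SUB_KEYS = ("aggs", "aggregations")


def metrics_with_sub_aggs(aggs, path=""):
    """Return paths of metric aggregations that wrongly declare sub-aggregations."""
    problems = []
    for name, spec in aggs.items():
        here = f"{path}>{name}" if path else name
        sub = next((spec[k] for k in SUB_KEYS if k in spec), None)
        if sub is None:
            continue  # leaf aggregation, nothing to check
        if any(agg_type in METRIC_TYPES for agg_type in spec):
            problems.append(here)
        problems.extend(metrics_with_sub_aggs(sub, here))
    return problems


def build_aggs(datelist):
    """Aggregation tree with all metrics as sibling leaves under the range buckets."""
    return {
        "group_code": {
            "terms": {"field": "Plant.keyword", "size": 10000},
            "aggs": {
                "group_date": {
                    "terms": {"field": "Date"},
                    "aggs": {
                        "group_range": {  # renamed from the second "group_date"
                            "range": {"field": "Date", "ranges": datelist},
                            "aggs": {
                                # Assumption: 50th percentile, i.e. the median.
                                "Median": {"percentiles": {"field": "Value",
                                                           "percents": [50]}},
                                "Mean": {"avg": {"field": "Value"}},
                                "group_value": {"sum": {"field": "Value"}},
                            },
                        }
                    },
                }
            },
        }
    }


# The shape from the question: a sum that tries to carry sub-aggregations.
broken = {"group_value": {"sum": {"field": "Value"},
                          "aggs": {"Mean": {"avg": {"field": "Value"}}}}}

broken_report = metrics_with_sub_aggs(broken)
fixed_report = metrics_with_sub_aggs(build_aggs([]))
```

Here `broken_report` names the offending aggregation and `fixed_report` is empty. Keep in mind that this layout computes the sum, mean, and percentile over the raw documents in each bucket; it does not take the median of the per-day sums as the pandas code in the question does.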