在给定的日期范围内选择雇员少于10张的员工

时间:2019-05-30 11:44:49

标签: elasticsearch

我的文档结构如下

{
   id:1,
   leaves:[
            {
               "reason":"",
               "date":"2019-01-01"
            },
            {
               "reason":"",
               "date":"2019-04-30"
            }
          ]
}

leaves是一个嵌套文档。文档结构可以更改。 我需要选择给定范围-2019-01-01至2019-05-30中少于10片叶子的员工。

我尝试了存储桶选择器聚合,但是“ min_bucket”存储桶路径未指向空存储桶(需要在范围内不存在任何叶子的地方)。我没有得到回应,也没有返回记录。

  "max_hourly_inner" : {
              "value" : null,
              "keys" : [ ]
            }

1 个答案:

答案 0 :(得分:1)

我想出了以下查询。在嵌套上执行聚合时有些棘手,但是您可以通过以下使用的聚合来实现。

我要解决的方程是向我显示指定日期范围内少于2片叶子的学生列表,即从2019-04-012019-05-30

样本文档:

// This student has 3 leaves over all and all 3 leaves in the specified date 
POST myleaves/_doc/1
{
  "id": 1001,
  "leaves" : [
    {
      "reason" : "",
      "date" : "2019-04-01"
    },
    {
      "reason" : "",
      "date" : "2019-04-29"
    },
    {
      "reason" : "",
      "date" : "2019-04-30"
    }
  ]
}

//This student has 4 leaves out of which 2 are in specified date range
POST myleaves/_doc/2
{
  "id": 1002,
  "leaves" : [
    {
      "reason" : "",
      "date" : "2019-04-01"
    },
    {
      "reason" : "",
      "date" : "2019-04-04"
    },
    {
      "reason" : "",
      "date" : "2019-07-29"
    },
    {
      "reason" : "",
      "date" : "2019-07-30"
    }
  ]
}

// This student has one leave but no leaves in specified date range
POST myleaves/_doc/3
{
  "id": 1003,
  "leaves":[
    {
      "reason" : "",
      "date" : "2019-07-29"
    }
  ]
}

//This student has no leaves at all
POST myleaves/_doc/4
{
  "id": 1004,
  "leaves":[

    ]
}

下面是聚合查询的结构

- Terms Aggregation on `id` field
 - Nested Aggregation on `leaves` field
  - Date Range aggregation on `leaves.date` field
   - Bucket Selector Aggregation on `count`. This is the part where we specify our condition 
 - Bucket Selector Aggregation to retrieve only documents having one bucket. (To avoid showing bucket with 0 doc counts) 

汇总查询:

POST <your_index_name>/_search
{  
   "size":0,
   "aggs":{  
      "mystudents":{  
         "terms":{  
            "field":"id",
            "size":10
         },
         "aggs":{  
            "mycount":{  
               "nested":{  
                  "path":"leaves"
               },
               "aggs": {
                 "valid_dates": {
                   "date_range": {
                     "field": "leaves.date",
                     "ranges": [
                       {
                         "from": "2019-04-01",
                         "to": "2019-05-30"
                       }
                     ]
                   },
                   "aggs": {
                     "myselector": {
                       "bucket_selector": {
                         "buckets_path": {
                           "myparams": "_count"
                         },
                         "script": "params.myparams <= 2"    <---- You may have to change this for less than 10 leaves params.myparams <=10
                       }
                     }
                   }
                 }
               }
            },
            "mybucket_selector":{  
               "bucket_selector":{  
                  "buckets_path":{  
                     "my_bucket_count":"mycount>valid_dates._bucket_count"
                  },
                  "script":"params.my_bucket_count == 1"
               }
            }
         }
      }
   }
}

请注意我在聚合查询中提到的评论。

聚集响应:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "mystudents" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1002,
          "doc_count" : 1,
          "mycount" : {
            "doc_count" : 4,                                 <----- Total Count of Leaves 
            "valid_dates" : {
              "buckets" : [
                {
                  "key" : "2019-04-01T00:00:00.000Z-2019-05-30T00:00:00.000Z",
                  "from" : 1.5540768E12,
                  "from_as_string" : "2019-04-01T00:00:00.000Z",
                  "to" : 1.5591744E12,
                  "to_as_string" : "2019-05-30T00:00:00.000Z",
                  "doc_count" : 2                            <------ Count of leaves in specified range
                }
              ]
            }
          }
        },
        {
          "key" : 1003,
          "doc_count" : 1,
          "mycount" : {
            "doc_count" : 1,
            "valid_dates" : {
              "buckets" : [
                {
                  "key" : "2019-04-01T00:00:00.000Z-2019-05-30T00:00:00.000Z",
                  "from" : 1.5540768E12,
                  "from_as_string" : "2019-04-01T00:00:00.000Z",
                  "to" : 1.5591744E12,
                  "to_as_string" : "2019-05-30T00:00:00.000Z",
                  "doc_count" : 0
                }
              ]
            }
          }
        },
        {
          "key" : 1004,
          "doc_count" : 1,
          "mycount" : {
            "doc_count" : 0,
            "valid_dates" : {
              "buckets" : [
                {
                  "key" : "2019-04-01T00:00:00.000Z-2019-05-30T00:00:00.000Z",
                  "from" : 1.5540768E12,
                  "from_as_string" : "2019-04-01T00:00:00.000Z",
                  "to" : 1.5591744E12,
                  "to_as_string" : "2019-05-30T00:00:00.000Z",
                  "doc_count" : 0
                }
              ]
            }
          }
        }
      ]
    }
  }
}

如果您查看回复,

  • 1001 没有出现 ,因为他在指定日期范围内有超过2片叶子,
  • 1002 出现,因为在指定的日期范围内,他在4片叶子中正好有2片叶子
  • 10031004 出现,因为它们没有在指定范围内离开任何叶子。

子句是在指定日期范围内精选的少于2片叶子的学生(包括没有叶子的学生)。

希望这会有所帮助!