Aggregating over nested objects with a filter in Elasticsearch 6

Time: 2018-06-09 05:31:45

Tags: elasticsearch nest elasticsearch-aggregation

I have a set of documents in Elasticsearch 6 that represent property units. Each property has a nested array of weekly rates:

{
   "name" : "Completely Awesome Cabin",
   "rates" : [
      {
         "start": "2018-06-09T00:00:00",
         "end": "2018-06-16T00:00:00",
         "weeklyRate": 100.0
      },
      {
         "start": "2018-06-16T00:00:00",
         "end": "2018-06-23T00:00:00",
         "weeklyRate": 200.0
      }
      ...
   ]
   ...
}

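I am filtering on several options, including dates. I need to add aggregations that give me the minimum and maximum weeklyRate across all units that pass the filters. I assume it should be some kind of nested aggregation with a filter. How can I do this?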

2 answers:

Answer 0: (score: 1)

Here is a complete example that runs with NEST 6.1.0

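using System;
using System.Collections.Generic;
using System.Linq;
using Elasticsearch.Net;
using Nest;

private static void Main()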
{
    var index = "default";
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var connectionSettings = new ConnectionSettings(pool)
        .DefaultIndex(index);

    var client = new ElasticClient(connectionSettings);

    if (client.IndexExists(index).Exists)
        client.DeleteIndex(index);

    client.CreateIndex(index, c => c
        .Mappings(m => m
            .Map<MyDocument>(mm => mm
                .AutoMap()
                .Properties(p => p
                    .Nested<Rate>(n => n
                        .AutoMap()
                        .Name(nn => nn.Rates)
                    )
                )
            )
        )
    );

    client.Bulk(b => b
        .IndexMany(new[] {
            new MyDocument
            {
                Name = "doc 1",
                Rates = new []
                {
                    new Rate
                    {
                        Start = new DateTime(2018, 6, 9),
                        End = new DateTime(2018, 6, 16),
                        WeeklyRate = 100
                    },
                    new Rate
                    {
                        Start = new DateTime(2018, 6, 16),
                        End = new DateTime(2018, 6, 23),
                        WeeklyRate = 200
                    }
                }
            },
            new MyDocument
            {
                Name = "doc 2",
                Rates = new []
                {
                    new Rate
                    {
                        Start = new DateTime(2018, 6, 9),
                        End = new DateTime(2018, 6, 16),
                        WeeklyRate = 120
                    },
                    new Rate
                    {
                        Start = new DateTime(2018, 6, 16),
                        End = new DateTime(2018, 6, 23),
                        WeeklyRate = 250
                    }
                }
            }
        })
        .Refresh(Refresh.WaitFor)
    );

    var searchResponse = client.Search<MyDocument>(s => s
        // apply your filtering in .Query(...) e.g. applicable date range
        .Query(q => q.MatchAll())
        // don't return documents, just calculate aggregations
        .Size(0)
        .Aggregations(a => a
            .Nested("nested_start_dates", n => n
                .Path(f => f.Rates)
                .Aggregations(aa => aa
                    .DateHistogram("start_dates", dh => dh
                        .Field(f => f.Rates.First().Start)
                        .Interval(DateInterval.Day)
                        .MinimumDocumentCount(1)
                        .Aggregations(aaa => aaa
                            .Min("min_rate", m => m
                                .Field(f => f.Rates.First().WeeklyRate)
                            )
                            .Max("max_rate", m => m
                                .Field(f => f.Rates.First().WeeklyRate)
                            )
                        )
                    )
                )
            )
        )
    );

    var nested = searchResponse.Aggregations.Nested("nested_start_dates");

    var startBuckets = nested.DateHistogram("start_dates").Buckets;

    foreach(var start in startBuckets)
    {
        var min = start.Min("min_rate").Value;
        var max = start.Max("max_rate").Value;

        Console.WriteLine($"{start.KeyAsString} - min: {min}, max: {max}");
    }
}

public class MyDocument
{
    public string Name {get;set;}

    public IEnumerable<Rate> Rates {get;set;}
}

public class Rate
{
    public DateTime Start {get;set;}

    public DateTime End {get;set;}

    public double WeeklyRate {get;set;}
}

which prints the following to the console

2018-06-09T00:00:00.000Z - min: 100, max: 120
2018-06-16T00:00:00.000Z - min: 200, max: 250

You may also be interested in other metrics aggregations, such as the Stats Aggregation
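As a rough sketch of that suggestion, the request body below replaces the separate min and max aggregations with a single stats aggregation, placed inside the same nested and date_histogram aggregations as the example above. The aggregation name rate_stats is made up for illustration, and the field names rates.start and rates.weeklyRate assume the default camel-cased field names NEST infers from the classes above.

```json
{
  "size": 0,
  "query": { "match_all": {} },
  "aggs": {
    "nested_start_dates": {
      "nested": { "path": "rates" },
      "aggs": {
        "start_dates": {
          "date_histogram": {
            "field": "rates.start",
            "interval": "day",
            "min_doc_count": 1
          },
          "aggs": {
            "rate_stats": {
              "stats": { "field": "rates.weeklyRate" }
            }
          }
        }
      }
    }
  }
}
```

Each bucket's rate_stats then carries count, min, max, avg and sum together, so one sub-aggregation covers what min_rate and max_rate return separately.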

Answer 1: (score: 0)

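In addition to the very helpful answer from Russ Cam, I want to post my final implementation, which does exactly what I needed. Here is the NEST aggregation: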

.Aggregations(a => a
    .Nested("budget_bound", n => n
        .Path(p => p.Rates)
        .Aggregations(aa => aa
            // keep only the nested rates whose start date falls in the requested range
            .Filter("by_start_date", fl => fl
                .Filter(fld => fld
                    .DateRange(dr => dr
                        .Field(f => f.Rates.First().Start)
                        .GreaterThanOrEquals(checkIn)
                        .LessThanOrEquals(checkOut)
                    )
                )
                // min and max are computed over the filtered rates only
                .Aggregations(md => md
                    .Min("min_budget", m => m
                        .Field(f => f.Rates.First().WeeklyRate)
                    )
                    .Max("max_budget", m => m
                        .Field(f => f.Rates.First().WeeklyRate)
                    )
                )
            )
        )
    )
)

Here is the corresponding ES query:

"aggs": {
"budget_bound": {
  "nested": {
    "path": "rates"
  },
  "aggs": {
    "by_start_date": {
      "filter": {
        "range": {
          "rates.start": {
            "gte": "2018-06-29T00:00:00+07:00", // parameter values
            "lte": "2018-07-06T00:00:00+07:00"  // parameter values
          }
        }
      },
      "aggs": {
        "min_budget": {
          "min": {
            "field": "rates.weeklyRate"
          }
        },
        "max_budget": {
          "max": {
            "field": "rates.weeklyRate"
          }
        }
      }
    }
  }
}}

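For me, the problem was figuring out how to nest the aggregations so that the nested collection is filtered before the min and max aggregations are computed.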
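To make that nesting concrete, here is a sketch of what a complete search request might look like, assuming the same rates mapping. The top-level nested query is only an assumed stand-in for whatever document-level filtering is actually applied. It selects which units match, but every rate of a matching unit still reaches the nested aggregation, which is why the same range condition is repeated inside it as a filter aggregation. The dates are the same placeholder parameter values as in the query above.

```json
{
  "size": 0,
  "query": {
    "nested": {
      "path": "rates",
      "query": {
        "range": {
          "rates.start": {
            "gte": "2018-06-29T00:00:00+07:00",
            "lte": "2018-07-06T00:00:00+07:00"
          }
        }
      }
    }
  },
  "aggs": {
    "budget_bound": {
      "nested": { "path": "rates" },
      "aggs": {
        "by_start_date": {
          "filter": {
            "range": {
              "rates.start": {
                "gte": "2018-06-29T00:00:00+07:00",
                "lte": "2018-07-06T00:00:00+07:00"
              }
            }
          },
          "aggs": {
            "min_budget": { "min": { "field": "rates.weeklyRate" } },
            "max_budget": { "max": { "field": "rates.weeklyRate" } }
          }
        }
      }
    }
  }
}
```

Without the by_start_date filter, min_budget and max_budget would be computed over all rates of the matching units, including weeks outside the requested range.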