Counting documents per day with NEST and ElasticSearch

Time: 2013-10-01 17:55:53

Tags: elasticsearch nest

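I'm trying to implement an ElasticSearch query with NEST that produces key/value pairs, where the key is a date and the value is a count. The documents I query all have a created_date, and I want to count how many documents were inserted on each day over the last 7 days or so.

I've looked at the Count method of IElasticClient, but it seems to give me a single total rather than a count per day. I think I need to run a facet over the data, but I can't quite figure out how to implement it.

Any help would be much appreciated :)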

2 answers:

Answer 0 (score: 1)

What you want is the date_histogram facet.

Here's an example that should cover the various ways I can interpret what you want:

export ELASTICSEARCH_ENDPOINT="http://localhost:9200"

# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type","_id":1}}
{"created_date":"2013-09-30T12:00:00Z","key":"foo","count":12}
{"index":{"_index":"play","_type":"type","_id":2}}
{"created_date":"2013-09-30T13:00:00Z","key":"bar","count":14}
{"index":{"_index":"play","_type":"type","_id":3}}
{"created_date":"2013-10-01T12:00:00Z","key":"foo","count":42}
{"index":{"_index":"play","_type":"type","_id":4}}
{"created_date":"2013-10-01T14:00:00Z","key":"foo","count":13}
'


# Do searches

curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
    "facets": {
        "foos_per_interval": {
            "date_histogram": {
                "key_field": "created_date",
                "value_field": "count",
                "interval": "day"
            },
            "facet_filter": {
                "term": {
                    "key": "foo"
                }
            }
        }
    }
}
'

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "play",
      "_type" : "type",
      "_id" : "1",
      "_score" : 1.0, "_source" : {"created_date":"2013-09-30T12:00:00Z","key":"foo","count":12}
    }, {
      "_index" : "play",
      "_type" : "type",
      "_id" : "2",
      "_score" : 1.0, "_source" : {"created_date":"2013-09-30T13:00:00Z","key":"bar","count":14}
    }, {
      "_index" : "play",
      "_type" : "type",
      "_id" : "3",
      "_score" : 1.0, "_source" : {"created_date":"2013-10-01T12:00:00Z","key":"foo","count":42}
    }, {
      "_index" : "play",
      "_type" : "type",
      "_id" : "4",
      "_score" : 1.0, "_source" : {"created_date":"2013-10-01T14:00:00Z","key":"foo","count":13}
    } ]
  },
  "facets" : {
    "foos_per_interval" : {
      "_type" : "date_histogram",
      "entries" : [ {
        "time" : 1380499200000,
        "count" : 1,
        "min" : 12.0,
        "max" : 12.0,
        "total" : 12.0,
        "total_count" : 1,
        "mean" : 12.0
      }, {
        "time" : 1380585600000,
        "count" : 2,
        "min" : 13.0,
        "max" : 42.0,
        "total" : 55.0,
        "total_count" : 2,
        "mean" : 27.5
      } ]
    }
  }
}
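The `time` values in the facet entries are epoch milliseconds marking the start of each day bucket (UTC by default, as far as I know). To get the date/count pairs the question asks for, they can be converted along these lines. This is only a sketch: `ms_to_day` is a made-up helper name, and the `date -d @...` form assumes GNU date.

```shell
# ms_to_day MILLIS: print the UTC calendar day for an epoch-milliseconds value.
# Assumes GNU date; BSD/macOS date would need "-r SECONDS" instead of "-d @SECONDS".
ms_to_day() {
  date -u -d "@$(( $1 / 1000 ))" +%F
}

# The two buckets from the response above, written as "time count" pairs.
for entry in "1380499200000 1" "1380585600000 2"; do
  set -- $entry                 # split the pair into $1 (time) and $2 (count)
  echo "$(ms_to_day "$1") $2"
done
# prints:
# 2013-09-30 1
# 2013-10-01 2
```

The `count` in each entry is the number of matching documents in that bucket, so it is the value to pair with the date; `total` and `mean` are statistics over the `count` field of the documents themselves.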

Answer 1 (score: 1)

Going with facets is indeed the way to go:

http://www.elasticsearch.org/guide/reference/api/search/facets/date-histogram-facet/
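For reference, a count-only version of the raw request for the last 7 days might look like the sketch below. The facet name `docs_per_day` and the helper `facet_body` are made up, `created_date` is the field from the question, and the `now-7d` date math in the range filter is an assumption that should be checked against the ElasticSearch version in use.

```shell
# facet_body: print a search body with a count-only date_histogram facet.
# "docs_per_day" is a hypothetical facet name; "created_date" comes from the question.
facet_body() {
  cat <<'EOF'
{
  "size": 0,
  "facets": {
    "docs_per_day": {
      "date_histogram": { "field": "created_date", "interval": "day" },
      "facet_filter": {
        "range": { "created_date": { "from": "now-7d", "to": "now" } }
      }
    }
  }
}
EOF
}

# Show the body; sending it needs a running node, for example:
# curl -XPOST "http://localhost:9200/_search?pretty" -d "$(facet_body)"
facet_body
```

Setting `size` to 0 just keeps the hits out of the response so that only the facet comes back.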

using System;
using Nest;

public class Doc
{
    public string Id { get; set; }
    public DateTime CreatedOn { get; set; }
}

//_client is assumed to be an already configured IElasticClient field
public void TempFacetExample()
{
    var result = this._client.Search<Doc>(s => s
        .FacetDateHistogram(fd => fd
            .OnField(p => p.CreatedOn)
            .Interval(DateInterval.Day)
            //global forces it to count out of scope
            //from the main query (if any)
            .Global()
            .FacetFilter(ff => ff
                .Range(rf => rf
                    //the range filter needs a field to apply to as well
                    .OnField(p => p.CreatedOn)
                    .From(DateTime.UtcNow.AddDays(-7))
                    .To(DateTime.UtcNow)
                )
            )
        )
    );
    var facetBucket = result.Facet<DateHistogramFacet>(p => p.CreatedOn);
    //facetBucket.Items now holds the days with their counts.
    //If I remember correctly, elasticsearch won't return empty buckets,
    //so you have to handle missing days (days that have no docs) yourself.

}
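Filling in the missing days can be done on the client once the facet comes back. Here is a rough sketch of the idea in shell rather than C#, so it can be tried without a project. `fill_days` is a made-up helper; it assumes GNU date and input lines of the form "YYYY-MM-DD count", one per day that the facet returned.

```shell
# fill_days END_DATE DAYS: read "YYYY-MM-DD count" lines on stdin and print one
# line for each of the DAYS days ending at END_DATE, using 0 for absent days.
fill_days() {
  end_s=$(date -u -d "$1" +%s)   # GNU date: END_DATE at midnight UTC, in seconds
  input=$(cat)                   # the buckets that the facet actually returned
  i=$(( $2 - 1 ))
  while [ "$i" -ge 0 ]; do
    day=$(date -u -d "@$(( end_s - i * 86400 ))" +%F)
    count=$(printf '%s\n' "$input" | awk -v d="$day" '$1 == d { print $2 }')
    echo "$day ${count:-0}"      # fall back to 0 when the day had no bucket
    i=$(( i - 1 ))
  done
}

# Two returned buckets, expanded to a 4-day window.
printf '2013-09-30 1\n2013-10-01 2\n' | fill_days 2013-10-01 4
# prints:
# 2013-09-28 0
# 2013-09-29 0
# 2013-09-30 1
# 2013-10-01 2
```

The same loop carries over to C#: walk the 7 days of the window and look each one up in the facet results, defaulting to 0 when it is absent.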