I use Elasticsearch to store and retrieve data.
curl http://localhost:9200/test/test -X POST -H "Content-type: application/json" -d '{"id":1, "created_at": "2015-03-02T12:00:00", "name": "test1"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":2, "created_at": "2015-03-03T12:00:00", "name": "test2"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":3, "created_at": "2015-03-03T12:00:00", "name": "test3"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":3, "created_at": "2015-03-03T12:01:00", "name": "test3"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":3, "created_at": "2015-03-03T12:02:00", "name": "test3"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":4, "created_at": "2015-03-02T12:00:00", "name": "test4"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":5, "created_at": "2015-03-02T12:00:00", "name": "test5"}'
curl http://localhost:9200/test/test/ -X POST -H "Content-type: application/json" -d '{"id":6, "created_at": "2015-03-03T12:00:00", "name": "test6"}'
When I try to group by created_at, it works:
curl http://localhost:9200/test/test/_search -X POST -d '{"size": "0", "aggs": {"group_by_created_at":{"terms":{"field": "created_at"}}}}' | python -m json.tool
{
"_shards": {
"failed": 0,
"successful": 5,
"total": 5
},
"aggregations": {
"group_by_created_at": {
"buckets": [
{
"doc_count": 3,
"key": 1425297600000,
"key_as_string": "2015-03-02"
},
{
"doc_count": 5,
"key": 1425384000000,
"key_as_string": "2015-03-03"
},
{
"doc_count": 1,
"key": 1425384060000,
"key_as_string": "2015-03-03T12:01:00.000Z"
},
{
"doc_count": 1,
"key": 1425384120000,
"key_as_string": "2015-03-03T12:02:00.000Z"
}
]
}
},
"hits": {
"hits": [],
"max_score": 0.0,
"total": 8
},
"timed_out": false,
"took": 3
}
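Note that the bucket keys above are epoch milliseconds of the full timestamp, which is why the 12:01 and 12:02 documents land in buckets of their own. A quick check of the key values, assuming the timestamps are UTC:

```python
from datetime import datetime, timezone

def to_epoch_millis(ts: str) -> int:
    """Convert an ISO timestamp (assumed UTC) to epoch milliseconds."""
    dt = datetime.fromisoformat(ts).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)

print(to_epoch_millis("2015-03-02T12:00:00"))  # 1425297600000
print(to_epoch_millis("2015-03-03T12:01:00"))  # 1425384060000
```

These match the `key` values in the buckets above, so the terms aggregation is grouping on the exact timestamp, not the day.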
In the example above, three records are from the date 2015-03-03 but end up in separate buckets; I want all records from the same day counted together. The desired output would be:
{
"_shards": {
"failed": 0,
"successful": 5,
"total": 5
},
"aggregations": {
"group_by_created_at": {
"buckets": [
{
"doc_count": 3,
"key": 1425297600000,
"key_as_string": "2015-03-02"
},
{
"doc_count": 5,
"key": 1425384000000,
"key_as_string": "2015-03-03"
}
]
}
},
"hits": {
"hits": [],
"max_score": 0.0,
"total": 8
},
"timed_out": false,
"took": 3
}
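For reference, the day-level counts in this desired output can be reproduced client-side from the eight sample documents; a minimal sketch:

```python
from collections import Counter

# The eight sample documents indexed above (timestamps only).
created_ats = [
    "2015-03-02T12:00:00",  # id 1
    "2015-03-03T12:00:00",  # id 2
    "2015-03-03T12:00:00",  # id 3
    "2015-03-03T12:01:00",  # id 3
    "2015-03-03T12:02:00",  # id 3
    "2015-03-02T12:00:00",  # id 4
    "2015-03-02T12:00:00",  # id 5
    "2015-03-03T12:00:00",  # id 6
]

# Group by the date part only, as the desired output does.
counts = Counter(ts.split("T")[0] for ts in created_ats)
print(counts)  # Counter({'2015-03-03': 5, '2015-03-02': 3})
```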
I tried bucketing with a range aggregation:
curl http://localhost:9200/test/test/_search -X POST -d '{"size": "0", "aggs": {"group_by_created_at":{"range":{"field": "created_at", "ranges": [{"gte": "2015-03-02T00:00:00", "lte": "2015-03-02T23:59:59"}, {"gte": "2015-03-03T00:00:00", "lte": "2015-03-03T23:59:59"}]}}}}' | python -m json.tool
{
"_shards": {
"failed": 0,
"successful": 5,
"total": 5
},
"aggregations": {
"group_by_created_at": {
"buckets": [
{
"doc_count": 8,
"key": "*-*"
},
{
"doc_count": 8,
"key": "*-*"
}
]
}
},
"hits": {
"hits": [],
"max_score": 0.0,
"total": 8
},
"timed_out": false,
"took": 2
}
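A likely cause, judging by the `*-*` keys above: the `range` aggregation expects `from`/`to` keys in each range, and the unrecognized `gte`/`lte` keys leave both ranges unbounded, so every range matches all documents. A corrected aggregation body might look like this (untested sketch):

```json
{
  "size": 0,
  "aggs": {
    "group_by_created_at": {
      "range": {
        "field": "created_at",
        "ranges": [
          {"from": "2015-03-02T00:00:00", "to": "2015-03-03T00:00:00"},
          {"from": "2015-03-03T00:00:00", "to": "2015-03-04T00:00:00"}
        ]
      }
    }
  }
}
```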
But it puts all 8 documents into both buckets. If I use the same range in a filtered query, it works fine:
curl http://localhost:9200/test/test/_search -X POST -d '{"query": {"filtered": {"filter": {"range": {"created_at": {"gte": "2015-03-03T00:00:00", "lte": "2015-03-03T23:59:59"}}}}}}' | python -m json.tool
{
"_shards": {
"failed": 0,
"successful": 5,
"total": 5
},
"hits": {
"hits": [
{
"_id": "mJs0WKiPTByQ6dLwJnKO8Q",
"_index": "test",
"_score": 1.0,
"_source": {
"created_at": "2015-03-03T12:00:00",
"id": 2,
"name": "test2"
},
"_type": "test"
},
{
"_id": "49a3pQX2TYa_KV029c0NLQ",
"_index": "test",
"_score": 1.0,
"_source": {
"created_at": "2015-03-03T12:02:00",
"id": 3,
"name": "test3"
},
"_type": "test"
},
{
"_id": "qWtAgCwSR_CTKsV1ibYVMg",
"_index": "test",
"_score": 1.0,
"_source": {
"created_at": "2015-03-03T12:01:00",
"id": 3,
"name": "test3"
},
"_type": "test"
},
{
"_id": "VoxSH6tXQmuugOVOmmrD2g",
"_index": "test",
"_score": 1.0,
"_source": {
"created_at": "2015-03-03T12:00:00",
"id": 6,
"name": "test6"
},
"_type": "test"
},
{
"_id": "oQmTxr5YRFaa3q7bvFOQLg",
"_index": "test",
"_score": 1.0,
"_source": {
"created_at": "2015-03-03T12:00:00",
"id": 3,
"name": "test3"
},
"_type": "test"
}
],
"max_score": 1.0,
"total": 5
},
"timed_out": false,
"took": 2
}
I'm missing something, and I don't know what :(
Answer 0 (score: 2)
There is a date_histogram aggregation, which buckets documents on any given time interval. To group by day, you can use:
"date_histogram":{
"field" : "created_at",
"interval" : "1d"
}
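A complete request body using this aggregation, following the same shape as the queries above, might look like this (untested sketch; `min_doc_count` is optional and simply hides empty days):

```json
{
  "size": 0,
  "aggs": {
    "group_by_created_at": {
      "date_histogram": {
        "field": "created_at",
        "interval": "1d",
        "min_doc_count": 1
      }
    }
  }
}
```

This should produce one bucket per calendar day, with `doc_count` 3 for 2015-03-02 and 5 for 2015-03-03, matching the desired output in the question.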