Assume the following mapping:
{
"document" : {
"properties" : {
"DocumentTitle" : {"type":"string", "index":"not_analyzed", "analyzer":"keyword", "store":true },
"ReceptionDate" : {"type":"date", "format":"yyyy-MM-dd HH:mm", "store":true }
}
}
}
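What I want is the TOP 5 documents by reception date (that is, the 5 most recent documents), but returned sorted by another field (DocumentTitle). Simply sorting by date and limiting to 5 results is therefore not enough.
Is this possible with a single query, or does it take several queries?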
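Update (at Sidharthan's request):
I come from the RDBMS world, where this is a very common problem that is solved with TOP or GROUP BY statements. So I expected a simple yes/no answer on whether ElasticSearch supports this kind of feature (TOP).
I have created the demo data below to help you better understand my question: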
PUT http://localhost:9200/custom/
POST http://localhost:9200/custom/document/_mapping
POST data:
{
"document" : {
"properties":{
"DocumentTitle": { "type": "string", "store": true },
"ReceptionDate": { "type": "date", "format" : "yyyy-MM-dd'T'HH:mmZ", "store": true }
}
}
}
POST http://localhost:9200/custom/document/
POST data:
{
"DocumentTitle":"A.PDF",
"ReceptionDate":"2001-01-01T00:00+0000"
}
POST http://localhost:9200/custom/document/
POST data:
{
"DocumentTitle":"B.PDF",
"ReceptionDate":"2002-01-01T00:00+0000"
}
POST http://localhost:9200/custom/document/
POST data:
{
"DocumentTitle":"C.PDF",
"ReceptionDate":"2003-01-01T00:00+0000"
}
POST http://localhost:9200/custom/document/
POST data:
{
"DocumentTitle":"D.PDF",
"ReceptionDate":"2004-01-01T00:00+0000"
}
POST http://localhost:9200/custom/document/
POST data:
{
"DocumentTitle":"E.PDF",
"ReceptionDate":"2005-01-01T00:00+0000"
}
POST http://localhost:9200/custom/document/
POST data:
{
"DocumentTitle":"F.PDF",
"ReceptionDate":"2006-01-01T00:00+0000"
}
POST http://localhost:9200/custom/document/
POST data:
{
"DocumentTitle":"G.PDF",
"ReceptionDate":"2006-01-01T00:00+0000"
}
The result of Sidharthan's proposal is as follows (I use a URI search here to keep the post shorter):
GET http://localhost:9200/custom/document/_search?q=DocumentTitle:*&sort=ReceptionDate:desc,DocumentTitle:asc&fields=ReceptionDate,DocumentTitle&size=5&pretty=true
- Response -
{
"took" : 10,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 7,
"max_score" : null,
"hits" : [ {
"_index" : "custom",
"_type" : "document",
"_id" : "v6gLeB9kSOCc5OgoTLT6BA",
"_score" : null, "_source" : {
"DocumentTitle":"F.PDF",
"ReceptionDate":"2006-01-01T00:00+0000"
},
"sort" : [ 1136073600000, "f.pdf" ]
}, {
"_index" : "custom",
"_type" : "document",
"_id" : "DJGivLtOQsW6DAGA5wgQzA",
"_score" : null, "_source" : {
"DocumentTitle":"G.PDF",
"ReceptionDate":"2006-01-01T00:00+0000"
},
"sort" : [ 1136073600000, "g.pdf" ]
}, {
"_index" : "custom",
"_type" : "document",
"_id" : "ic3v37xGQtydrjb-RaJl4g",
"_score" : null, "_source" : {
"DocumentTitle":"E.PDF",
"ReceptionDate":"2005-01-01T00:00+0000"
},
"sort" : [ 1104537600000, "e.pdf" ]
}, {
"_index" : "custom",
"_type" : "document",
"_id" : "kCcgoiodQKuxsD9n6ZGifw",
"_score" : null, "_source" : {
"DocumentTitle":"D.PDF",
"ReceptionDate":"2004-01-01T00:00+0000"
},
"sort" : [ 1072915200000, "d.pdf" ]
}, {
"_index" : "custom",
"_type" : "document",
"_id" : "jUYP0d3pSmSjlMqw3TsS1Q",
"_score" : null, "_source" : {
"DocumentTitle":"C.PDF",
"ReceptionDate":"2003-01-01T00:00+0000"
},
"sort" : [ 1041379200000, "c.pdf" ]
} ]
}
}
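Given the data, this is exactly the right result set, but it is in the wrong order.
I need these items ordered by DocumentTitle only (C, D, E, F, G).
Unless ES supports some kind of TOP, I think the only solution is to fetch the result set sorted by ReceptionDate and then sort it manually on the client side, as kielni suggested.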
Answer 0 (score: 0)
Sort on both fields: first by ReceptionDate, then by DocumentTitle.
Try:
{
"sort": [
{
"ReceptionDate": {
"order": "desc"
}
},
{
"DocumentTitle": "asc"
}
],
"query": {
"match_all": {}
},
"size": 5
}
Answer 1 (score: 0)
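It seems ES cannot do this at the moment.
I will keep fetching the result set sorted by ReceptionDate and apply the DocumentTitle ordering in the client.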
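For reference, here is a minimal sketch of that client-side step in Python. It assumes the search was already run with sort=ReceptionDate:desc and size=5, and that the response has the shape shown in the question. The function name top_n_sorted_by_title and the trimmed-down response dict are mine, for illustration only.

```python
def top_n_sorted_by_title(response, n=5):
    """Take the first n hits (already sorted by ReceptionDate desc by ES)
    and re-sort their _source documents by DocumentTitle, ascending."""
    hits = response["hits"]["hits"][:n]
    docs = [hit["_source"] for hit in hits]
    return sorted(docs, key=lambda doc: doc["DocumentTitle"])


# Trimmed-down copy of the response from the question (only _source kept).
response = {"hits": {"hits": [
    {"_source": {"DocumentTitle": "F.PDF", "ReceptionDate": "2006-01-01T00:00+0000"}},
    {"_source": {"DocumentTitle": "G.PDF", "ReceptionDate": "2006-01-01T00:00+0000"}},
    {"_source": {"DocumentTitle": "E.PDF", "ReceptionDate": "2005-01-01T00:00+0000"}},
    {"_source": {"DocumentTitle": "D.PDF", "ReceptionDate": "2004-01-01T00:00+0000"}},
    {"_source": {"DocumentTitle": "C.PDF", "ReceptionDate": "2003-01-01T00:00+0000"}},
]}}

titles = [doc["DocumentTitle"] for doc in top_n_sorted_by_title(response)]
print(titles)  # ['C.PDF', 'D.PDF', 'E.PDF', 'F.PDF', 'G.PDF']
```

The sort here is a plain case-sensitive string comparison. If titles can differ in case, a key such as doc["DocumentTitle"].lower() may be closer to what is wanted.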
Thanks, everyone.