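My type has a field that is an array of times in ISO 8601 format. I want to get all listings that have a time on a given day, and then sort them by the earliest time at which they occur on that specific day. The problem is that my query sorts by the earliest time across all days.
You can reproduce the problem with the script below.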
curl -XPUT 'localhost:9200/listings?pretty'
curl -XPOST 'localhost:9200/listings/listing/_bulk?pretty' -d '
{"index": { } }
{ "name": "second on 6th (3rd on the 5th)", "times": ["2018-12-05T12:00:00","2018-12-06T11:00:00"] }
{"index": { } }
{ "name": "third on 6th (1st on the 5th)", "times": ["2018-12-05T10:00:00","2018-12-06T12:00:00"] }
{"index": { } }
{ "name": "first on the 6th (2nd on the 5th)", "times": ["2018-12-05T11:00:00","2018-12-06T10:00:00"] }
'
# because ES takes time to add them to index
sleep 2
echo "Query listings on the 6th!"
curl -XPOST 'localhost:9200/listings/_search?pretty' -d '
{
  "sort": {
    "times": {
      "order": "asc",
      "nested_filter": {
        "range": {
          "times": {
            "gte": "2018-12-06T00:00:00",
            "lte": "2018-12-06T23:59:59"
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "filter": {
        "range": {
          "times": {
            "gte": "2018-12-06T00:00:00",
            "lte": "2018-12-06T23:59:59"
          }
        }
      }
    }
  }
}'
curl -XDELETE 'localhost:9200/listings?pretty'
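Putting the script above into a .sh file and running it reproduces the problem. You will see that the order is based on the 5th rather than the 6th. Elasticsearch converts the times to epoch_millis numbers for sorting; you can see the epoch number in the sort field of each hit, e.g. 1544007600000. For an asc sort, it takes the smallest number in the array (the order within the array does not matter) and sorts on that.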
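To make that behaviour concrete, here is a rough offline model in Python. It does not talk to Elasticsearch at all; it only mimics "convert to epoch millis, take the minimum of the whole array, sort ascending". The helper names (`to_epoch_millis`, `min_sort_key`) are made up for this illustration, and the timestamps are assumed to be UTC because they carry no zone.

```python
import calendar
from datetime import datetime

def to_epoch_millis(iso):
    """Treat a zone-less ISO 8601 timestamp as UTC and return epoch milliseconds."""
    parsed = datetime.strptime(iso, "%Y-%m-%dT%H:%M:%S")
    return calendar.timegm(parsed.timetuple()) * 1000

# Same three documents as in the bulk request above.
listings = [
    {"name": "second on 6th (3rd on the 5th)",
     "times": ["2018-12-05T12:00:00", "2018-12-06T11:00:00"]},
    {"name": "third on 6th (1st on the 5th)",
     "times": ["2018-12-05T10:00:00", "2018-12-06T12:00:00"]},
    {"name": "first on the 6th (2nd on the 5th)",
     "times": ["2018-12-05T11:00:00", "2018-12-06T10:00:00"]},
]

def min_sort_key(doc):
    """Model of an asc sort on a multi-valued field: the smallest value wins."""
    return min(to_epoch_millis(t) for t in doc["times"])

# The query filter keeps all three documents, since each has a time on the 6th,
# but the sort key still comes from the 5th.
ordered = sorted(listings, key=min_sort_key)
for doc in ordered:
    print(min_sort_key(doc), doc["name"])
```

Under those assumptions this prints the listings in the order third, first, second, i.e. the order of the 5th, with 1544007600000 as the key of the middle one.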
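Somehow I need this to be ordered by the earliest time on the queried day (i.e. the 6th).
I'm currently using Elasticsearch 2.4, but even if someone can show me how it's done in the current version, that would be great.
In case it helps, here is their documentation on nested queries and scripting.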
Answer 0 (score: 3)
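I think the problem here is that nested sorting is meant for nested objects, not for arrays.
If you convert the documents to use an array of nested objects instead of a simple array of dates, you can construct a nested filtered sort that works.
The following is for Elasticsearch 6.0. From 6.1 on they changed the syntax a bit, and I'm not sure how much of this works in 2.x:
Mapping: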
PUT nested-listings
{
  "mappings": {
    "listing": {
      "properties": {
        "name": {
          "type": "keyword"
        },
        "openTimes": {
          "type": "nested",
          "properties": {
            "date": {
              "type": "date"
            }
          }
        }
      }
    }
  }
}
Data:
POST nested-listings/listing/_bulk
{"index": { } }
{ "name": "second on 6th (3rd on the 5th)", "openTimes": [ { "date": "2018-12-05T12:00:00" }, { "date": "2018-12-06T11:00:00" }] }
{"index": { } }
{ "name": "third on 6th (1st on the 5th)", "openTimes": [ {"date": "2018-12-05T10:00:00"}, { "date": "2018-12-06T12:00:00" }] }
{"index": { } }
{ "name": "first on the 6th (2nd on the 5th)", "openTimes": [ {"date": "2018-12-05T11:00:00" }, { "date": "2018-12-06T10:00:00" }] }
So instead of "nextNexpectionOpenTimes" we have an "openTimes" nested object, and each listing contains an array of openTimes.
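If you already have documents in the flat shape from the question, the conversion is mechanical. Here is a minimal sketch in Python, assuming the flat field is called `times` (as in the question) and the nested one `openTimes` with a `date` property (as in the mapping above). `to_nested` and `bulk_body` are hypothetical helper names, and nothing here sends a request; it only builds the newline-delimited bulk body.

```python
import json

def to_nested(doc, source_field="times", nested_field="openTimes"):
    """Return a copy of doc with the flat date array wrapped into nested objects."""
    converted = {k: v for k, v in doc.items() if k != source_field}
    converted[nested_field] = [{"date": t} for t in doc.get(source_field, [])]
    return converted

def bulk_body(docs):
    """Build a _bulk payload: one action line plus one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(to_nested(doc)))
    # The bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"

flat = [
    {"name": "second on 6th (3rd on the 5th)",
     "times": ["2018-12-05T12:00:00", "2018-12-06T11:00:00"]},
]
print(bulk_body(flat))
```

The resulting text can be posted to `nested-listings/listing/_bulk` in the same way as the data above.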
Now the search:
POST nested-listings/_search
{
  "sort": {
    "openTimes.date": {
      "order": "asc",
      "nested_path": "openTimes",
      "nested_filter": {
        "range": {
          "openTimes.date": {
            "gte": "2018-12-06T00:00:00",
            "lte": "2018-12-06T23:59:59"
          }
        }
      }
    }
  },
  "query": {
    "nested": {
      "path": "openTimes",
      "query": {
        "bool": {
          "filter": {
            "range": {
              "openTimes.date": {
                "gte": "2018-12-06T00:00:00",
                "lte": "2018-12-06T23:59:59"
              }
            }
          }
        }
      }
    }
  }
}
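The main difference here is that the query is slightly different, because you need to use a "nested" query to filter on nested objects.
This gives the following result: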
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": null,
    "hits": [
      {
        "_index": "nested-listings",
        "_type": "listing",
        "_id": "vHH6e2cB28sphqox2Dcm",
        "_score": null,
        "_source": {
          "name": "first on the 6th (2nd on the 5th)"
        },
        "sort": [
          1544090400000
        ]
      },
      {
        "_index": "nested-listings",
        "_type": "listing",
        "_id": "unH6e2cB28sphqox2Dcm",
        "_score": null,
        "_source": {
          "name": "second on 6th (3rd on the 5th)"
        },
        "sort": [
          1544094000000
        ]
      },
      {
        "_index": "nested-listings",
        "_type": "listing",
        "_id": "u3H6e2cB28sphqox2Dcm",
        "_score": null,
        "_source": {
          "name": "third on 6th (1st on the 5th)"
        },
        "sort": [
          1544097600000
        ]
      }
    ]
  }
}
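As a sanity check on those sort values, here is a small offline model in Python of what the nested filtered sort is doing: for each listing, keep only the nested dates inside the requested range, then take the minimum of what is left. This is only my mental model of the behaviour, not Elasticsearch code. `millis` and `filtered_min` are names invented for the sketch, and the zone-less timestamps are assumed to be UTC.

```python
import calendar
from datetime import datetime

def millis(iso):
    """Treat a zone-less ISO 8601 timestamp as UTC and return epoch milliseconds."""
    return calendar.timegm(datetime.strptime(iso, "%Y-%m-%dT%H:%M:%S").timetuple()) * 1000

def filtered_min(doc, gte, lte):
    """Smallest nested date inside [gte, lte], or None if the listing has none."""
    low, high = millis(gte), millis(lte)
    in_range = [millis(o["date"]) for o in doc["openTimes"]
                if low <= millis(o["date"]) <= high]
    return min(in_range) if in_range else None

docs = [
    {"name": "second on 6th (3rd on the 5th)",
     "openTimes": [{"date": "2018-12-05T12:00:00"}, {"date": "2018-12-06T11:00:00"}]},
    {"name": "third on 6th (1st on the 5th)",
     "openTimes": [{"date": "2018-12-05T10:00:00"}, {"date": "2018-12-06T12:00:00"}]},
    {"name": "first on the 6th (2nd on the 5th)",
     "openTimes": [{"date": "2018-12-05T11:00:00"}, {"date": "2018-12-06T10:00:00"}]},
]

day = ("2018-12-06T00:00:00", "2018-12-06T23:59:59")
# Drop listings with nothing on that day (the job of the nested query), then sort.
matching = [d for d in docs if filtered_min(d, *day) is not None]
result = sorted(matching, key=lambda d: filtered_min(d, *day))
for d in result:
    print(filtered_min(d, *day), d["name"])
```

With the three sample listings this gives first, second, third with keys 1544090400000, 1544094000000 and 1544097600000, matching the sort values in the response above.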
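I don't think you can actually pick a single value out of an array in ES, so for sorting you are always sorting against all of the values. With a plain array, the best you can do is choose how the array is reduced for sorting (lowest, highest, mean, etc.).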
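Regarding the syntax change from 6.1 on that I mentioned earlier: as far as I understand it, `nested_path` and `nested_filter` were deprecated in favour of a single `nested` object with `path` and `filter` keys. Below is a sketch in Python that just builds such a request body as a dict and prints it as JSON. I have not run it against a cluster, so treat the shape as an assumption to check against the docs for your version; `day_range` and `build_search` are hypothetical helper names.

```python
import json

def day_range(field, day):
    """Range filter covering one calendar day, in the same style as the queries above."""
    return {"range": {field: {"gte": f"{day}T00:00:00", "lte": f"{day}T23:59:59"}}}

def build_search(day, path="openTimes", field="openTimes.date"):
    """Search body using the newer `nested` sort option (assumed 6.1+ syntax)."""
    day_filter = day_range(field, day)
    return {
        "sort": [
            {field: {"order": "asc",
                     "nested": {"path": path, "filter": day_filter}}}
        ],
        "query": {
            "nested": {
                "path": path,
                "query": {"bool": {"filter": day_filter}},
            }
        },
    }

print(json.dumps(build_search("2018-12-06"), indent=2))
```

The printed JSON would be used as the body of `POST nested-listings/_search`, in place of the 6.0 version shown earlier.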