My documents are structured as follows:
{'_id': 1, '_type': '2017-01-01',...} --- (1)
{'_id': 1, '_type': '2017-01-02',...} --- (2)
{'_id': 2, '_type': '2017-01-01',...} --- (3)
{'_id': 2, '_type': '2017-01-02',...} --- (4)
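Each id can belong to several types (dates, in this case). The goal is to get, for a given list of ids, the document with the minimum _type value for each id.
So for the input id = [1,2], documents (1) & (3) should be returned, because they have the minimum _type value (2017-01-01).
This can easily be done one id at a time (by querying for the smallest available type of a given id), but that is too expensive, because the given id list contains thousands of ids.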
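To make the expected result concrete, here is a minimal in-memory sketch of the selection described above. It does not use Elasticsearch at all; the function name `earliest_per_id` and the sample list are hypothetical and only illustrate what "minimum _type per id" means for the four example documents.

```python
def earliest_per_id(docs, ids):
    """Return, for each requested id, the document with the smallest _type."""
    wanted = set(ids)
    best = {}
    for doc in docs:
        doc_id = doc["_id"]
        if doc_id not in wanted:
            continue
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        if doc_id not in best or doc["_type"] < best[doc_id]["_type"]:
            best[doc_id] = doc
    return best


# The four example documents from the question (other fields omitted).
docs = [
    {"_id": 1, "_type": "2017-01-01"},
    {"_id": 1, "_type": "2017-01-02"},
    {"_id": 2, "_type": "2017-01-01"},
    {"_id": 2, "_type": "2017-01-02"},
]

result = earliest_per_id(docs, [1, 2])
```

Here `result` maps each id to documents (1) and (3) respectively, which is the output the query should produce in a single round trip.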
Answer 0 (score: 0)
The following query does what I need:
{
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "OpID": [id_list]
          }
        }
      ]
    }
  },
  "aggs": {
    "distinct_id": {
      "terms": {
        "field": "OpID",
        "size": 100
      },
      "aggs": {
        "least_type": {
          "top_hits": {
            "sort": [
              {
                "as_of": {
                  "order": "asc"
                }
              }
            ],
            "_source": {
              "includes": ["as_of", "OpID"]
            },
            "size": 1
          }
        }
      }
    }
  },
  "size": 0
}
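Explanation: The bool query restricts the search to documents whose OpID is in the given ID list. The terms aggregation creates one bucket per distinct OpID, and the top_hits sub-aggregation returns the top 1 document in each bucket after sorting by the given field (as_of in this case) in ascending order. The meaning of _source remains the same here: it limits which fields are returned. Note that the "size" of the terms aggregation (100 here) caps the number of ID buckets returned, so it should be at least as large as the ID list. The top-level "size": 0 suppresses the regular hits so that only the aggregation results come back.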
The above solution worked for me because I had a field as_of, of date type, that holds a copy of the value stored in _type. The above aggregation isn't possible on the field _type itself.
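For a list of thousands of IDs, it may be easier to build the request body in code. The following is a rough sketch, not a definitive implementation: the helper names `build_min_type_query` and `extract_earliest` are hypothetical, the field names OpID and as_of are taken from the query above, and the sample response is hand-written to mirror the usual shape of a terms + top_hits aggregation result rather than copied from a real cluster.

```python
def build_min_type_query(ids, id_field="OpID", date_field="as_of"):
    """Build the request body from the answer for an arbitrary list of ids."""
    ids = list(ids)
    return {
        "query": {"bool": {"must": [{"terms": {id_field: ids}}]}},
        "aggs": {
            "distinct_id": {
                # One bucket per id; size must be positive and cover every id.
                "terms": {"field": id_field, "size": max(1, len(ids))},
                "aggs": {
                    "least_type": {
                        "top_hits": {
                            "sort": [{date_field: {"order": "asc"}}],
                            "_source": {"includes": [date_field, id_field]},
                            "size": 1,
                        }
                    }
                },
            }
        },
        "size": 0,  # no regular hits, only aggregation results
    }


def extract_earliest(response):
    """Map each bucket key (the id) to the _source of its earliest document."""
    result = {}
    for bucket in response["aggregations"]["distinct_id"]["buckets"]:
        hits = bucket["least_type"]["hits"]["hits"]
        if hits:
            result[bucket["key"]] = hits[0]["_source"]
    return result


body = build_min_type_query([1, 2])

# Hand-written stand-in for a search response (illustrative values only).
sample_response = {
    "aggregations": {
        "distinct_id": {
            "buckets": [
                {"key": 1, "doc_count": 2, "least_type": {"hits": {"hits": [
                    {"_source": {"OpID": 1, "as_of": "2017-01-01"}}]}}},
                {"key": 2, "doc_count": 2, "least_type": {"hits": {"hits": [
                    {"_source": {"OpID": 2, "as_of": "2017-01-01"}}]}}},
            ]
        }
    }
}

earliest = extract_earliest(sample_response)
```

The `body` dictionary can be passed as the search body to whichever Elasticsearch client is in use. Sizing the terms aggregation from the length of the ID list avoids silently dropping buckets when the list is longer than a fixed value such as 100.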