I have an aggregation in Elasticsearch that returns a response like this:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1261,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "clusters": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 1073,
      "buckets": [
        { "key": 813058, "doc_count": 46 },
        { "key": 220217, "doc_count": 29 },
        { "key": 287763, "doc_count": 23 },
        { "key": 527217, "doc_count": 20 },
        { "key": 881778, "doc_count": 15 },
        { "key": 700725, "doc_count": 14 },
        { "key": 757602, "doc_count": 13 },
        { "key": 467496, "doc_count": 10 },
        { "key": 128318, "doc_count": 9 },
        { "key": 317261, "doc_count": 9 }
      ]
    }
  }
}
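I want to get one document for each bucket in the aggregation (the highest-scoring one or a random one; anything works). How can I do this?
The query I use to get the aggregation is: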
GET myindex/_search
{
  "size": 0,
  "aggs": {
    "clusters": {
      "terms": {
        "field": "myfield",
        "size": 100000
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "query_string": { "default_field": "field1", "query": "val1" }
        },
        {
          "query_string": { "default_field": "field2", "query": "val2" }
        }
      ]
    }
  }
}
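I need this because I am trying to implement a cluster-based sentence similarity system. I pick one sentence from each cluster and then check its similarity to a given sentence.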
Answer 0 (score: 0)
I was able to solve this by using the top hits aggregation described here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html
Example query below:
GET myindex/_search
{
  "size": 0,
  "aggs": {
    "clusters": {
      "terms": {
        "field": "myfield",
        "size": 100000
      },
      "aggs": {
        "mydoc": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "query_string": { "default_field": "field1", "query": "val1" }
        },
        {
          "query_string": { "default_field": "field2", "query": "val2" }
        }
      ]
    }
  }
}
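Since top_hits sorts by score by default, "size": 1 returns the highest-scoring document in each bucket. For anyone consuming the result from code, here is a rough Python sketch of how the request body could be built and how the single document per bucket might be pulled out of the response. The helper names (build_body, one_doc_per_bucket), the sample response, and the "sentence" field in _source are my own assumptions for illustration, not part of any client library; sending the body to the cluster is left to whichever client you use.

```python
from typing import Any, Dict, List, Tuple


def build_body(bucket_field: str,
               filters: List[Tuple[str, str]],
               max_buckets: int = 100000) -> Dict[str, Any]:
    """Build the search body: a terms aggregation with a top_hits sub-aggregation."""
    return {
        "size": 0,  # no top-level hits, only aggregations
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"default_field": field, "query": value}}
                    for field, value in filters
                ]
            }
        },
        "aggs": {
            "clusters": {
                "terms": {"field": bucket_field, "size": max_buckets},
                # one document per bucket, best score first by default
                "aggs": {"mydoc": {"top_hits": {"size": 1}}},
            }
        },
    }


def one_doc_per_bucket(response: Dict[str, Any]) -> Dict[Any, Dict[str, Any]]:
    """Map each bucket key to the _source of its single top hit."""
    docs = {}
    for bucket in response["aggregations"]["clusters"]["buckets"]:
        hits = bucket["mydoc"]["hits"]["hits"]
        if hits:  # skip a bucket if it came back without a hit
            docs[bucket["key"]] = hits[0]["_source"]
    return docs


body = build_body("myfield", [("field1", "val1"), ("field2", "val2")])

# Hypothetical, trimmed response used only to exercise the parsing helper.
sample_response = {
    "aggregations": {
        "clusters": {
            "buckets": [
                {"key": 813058, "doc_count": 46,
                 "mydoc": {"hits": {"hits": [
                     {"_id": "a1", "_score": 1.2,
                      "_source": {"sentence": "first example"}}]}}},
                {"key": 220217, "doc_count": 29,
                 "mydoc": {"hits": {"hits": [
                     {"_id": "b7", "_score": 0.9,
                      "_source": {"sentence": "second example"}}]}}},
            ]
        }
    }
}

representatives = one_doc_per_bucket(sample_response)
for key, doc in representatives.items():
    print(key, doc["sentence"])
```

The representatives dict can then be fed straight into the similarity check, one sentence per cluster key.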