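I have an index of listings. Each document has a weight associated with it. I need to be able to search for "Engineer" and get the best result for each distinct "title" that matches, based on relevance and an arbitrary weight associated with the document.
Example index: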
Doc 1 {"title": "Java Engineer", "content": "A long description", "weighted_importance": 10}
Doc 2 {"title": "Search Engineer", "content": "A long description", "weighted_importance": 10}
Doc 3 {"title": "Ruby Engineer", "content": "A long description", "weighted_importance": 10}
Doc 4 {"title": "PHP Engineer", "content": "A long description", "weighted_importance": 10}
Doc 5 {"title": "Java Engineer", "content": "A long description", "weighted_importance": 10}
Doc 6 {"title": "Search Engineer", "content": "A long description", "weighted_importance": 100}
Doc 7 {"title": "Java Engineer", "content": "A long description", "weighted_importance": 100}
Doc 8 {"title": "MySQL Engineer", "content": "A long description", "weighted_importance": 10}
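If I search for "Engineer", it should de-duplicate the items that share the same title and return the best result for each title, boosted by the weighted_importance field, such as: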
Doc 6 {"title": "Search Engineer", "content": "A long description", "weighted_importance": 100}
Doc 7 {"title": "Java Engineer", "content": "A long description", "weighted_importance": 100}
Doc 3 {"title": "Ruby Engineer", "content": "A long description", "weighted_importance": 10}
Doc 4 {"title": "PHP Engineer", "content": "A long description", "weighted_importance": 10}
Doc 8 {"title": "MySQL Engineer", "content": "A long description", "weighted_importance": 10}
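The last three results can be sorted however they fall, but the top two results need to bubble to the surface, each within its own bucket.
I'm new to ElasticSearch, as you can tell. Any help would be greatly appreciated.
Answer 0 (score: 2)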
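Try this approach:
- define a not_analyzed version of title, so that the buckets are built from the full title rather than from the individual terms that make up the title:
{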
"mappings": {
"engineers": {
"properties": {
"title": {
"type": "string",
"fields":{
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
},
"content": {
"type": "string"
},
"weighted_importance": {
"type": "integer"
}
}
}
}
}
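- use a terms aggregation on title.raw
- use a top_hits sub-aggregation to bring back the "best" document for each bucket
- use another sub-aggregation at the same level as top_hits, which should be a max aggregation that calculates the maximum weighted_importance
- use the max aggregation to order the resulting buckets:
GET /my_index/engineers/_search?search_type=count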
{
"query": {
"match": {
"title": "Engineer"
}
},
"aggs": {
"title": {
"terms": {
"field": "title.raw",
"order": {"best_hit":"desc"}
},
"aggs": {
"first_match": {
"top_hits": {
"sort": [{"weighted_importance": {"order": "desc"}}],
"size": 1
}
},
"best_hit": {
"max": {
"lang": "groovy",
"script": "doc['weighted_importance'].value"
}
}
}
}
}
}
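To make the bucket logic concrete, here is a minimal Python sketch that emulates what the request above computes, using the eight sample documents from the question. It is only an illustration and does not talk to Elasticsearch: the function name best_per_title and the "id" field are hypothetical, the whole-word check is a rough stand-in for the match query, and the order of buckets that tie on best_hit is whatever this sketch produces, not necessarily what Elasticsearch returns.

```python
from collections import OrderedDict

# The eight sample documents from the question ("id" is added for illustration).
DOCS = [
    {"id": 1, "title": "Java Engineer", "weighted_importance": 10},
    {"id": 2, "title": "Search Engineer", "weighted_importance": 10},
    {"id": 3, "title": "Ruby Engineer", "weighted_importance": 10},
    {"id": 4, "title": "PHP Engineer", "weighted_importance": 10},
    {"id": 5, "title": "Java Engineer", "weighted_importance": 10},
    {"id": 6, "title": "Search Engineer", "weighted_importance": 100},
    {"id": 7, "title": "Java Engineer", "weighted_importance": 100},
    {"id": 8, "title": "MySQL Engineer", "weighted_importance": 10},
]


def best_per_title(docs, term):
    """Emulate: match on title, bucket by full title, keep the top hit, order by max."""
    buckets = OrderedDict()
    for doc in docs:
        # Rough stand-in for the match query: case-insensitive whole-word match.
        if term.lower() not in doc["title"].lower().split():
            continue
        # Bucket on the full, unanalyzed title, like the title.raw field.
        buckets.setdefault(doc["title"], []).append(doc)

    results = []
    for title, members in buckets.items():
        # Like top_hits with size 1, sorted by weighted_importance descending.
        first_match = max(members, key=lambda d: d["weighted_importance"])
        results.append({
            "key": title,
            # Like the max sub-aggregation used to order the buckets.
            "best_hit": first_match["weighted_importance"],
            "first_match": first_match,
        })

    # Like "order": {"best_hit": "desc"}; ties keep their insertion order here.
    results.sort(key=lambda b: b["best_hit"], reverse=True)
    return results


for bucket in best_per_title(DOCS, "Engineer"):
    print(bucket["key"], bucket["best_hit"], "doc", bucket["first_match"]["id"])
```

In the real response, the equivalent data sits under aggregations.title.buckets: each bucket carries its key, a best_hit.value, and the winning document in first_match.hits.hits[0]._source.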