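I am trying to get a higher score for documents that match the full name than for documents that only match an edge n-gram subset of the same value.
The results are: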
Pos Name _score _id
1 Baritone horn 7.56878 1786
2 Baritone ukulele 7.56878 2313
3 Bari 7.56878 2360
4 Baritone voice 7.56878 1787
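I expected the third one ("Bari") to get a higher score because it is the full name. However, the edge n-gram decomposition gives every other document exactly the same indexed term, "bari". As you can see in the results table, all the scores are equal, and I don't even know how Elasticsearch ordered them, because the hits are sorted neither by _id nor by name.
How can I achieve this?
Thanks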
{
"analysis": {
"filter": {
"edgeNGram_filter": {
"type": "edgeNGram",
"min_gram": 3,
"max_gram": 20,
"token_chars": [
"letter",
"digit",
"punctuation",
"symbol"
]
}
},
"analyzer": {
"edgeNGram_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase",
"asciifolding",
"edgeNGram_filter"
]
},
"whitespace_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
}
{
"name": {
"type": "string",
"index": "not_analyzed"
},
"suggest": {
"type": "completion",
"index_analyzer": "nGram_analyzer",
"search_analyzer": "whitespace_analyzer",
"payloads": true
}
}
POST /attribute-tree/attribute/_search
{
"query": {
"match": {
"suggest": "Bari"
}
}
}
(only the relevant data is kept)
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 4,
"max_score": 7.56878,
"hits": [
{
"_index": "attribute-tree",
"_type": "attribute",
"_id": "1786",
"_score": 7.56878,
"_source": {
"name": "Baritone horn",
"suggest": {
"input": [
"Baritone",
"horn"
],
"output": "Baritone horn"
}
}
},
{
"_index": "attribute-tree",
"_type": "attribute",
"_id": "2313",
"_score": 7.56878,
"_source": {
"name": "Baritone ukulele",
"suggest": {
"input": [
"Baritone",
"ukulele"
],
"output": "Baritone ukulele"
}
}
},
{
"_index": "attribute-tree",
"_type": "attribute",
"_id": "2360",
"_score": 7.56878,
"_source": {
"name": "Bari",
"suggest": {
"input": [
"Bari"
],
"output": "Bari"
}
}
},
{
"_index": "attribute-tree",
"_type": "attribute",
"_id": "1787",
"_score": 7.56878,
"_source": {
"name": "Baritone voice",
"suggest": {
"input": [
"Baritone",
"voice"
],
"output": "Baritone voice"
}
}
}
]
}
}
Answer 0 (score: 3)
You can use the bool query and its should clause to add score to exact matches, as follows:
POST /attribute-tree/attribute/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"suggest": "Bari"
}
}
],
"should": [
{
"match": {
"name": "Bari"
}
}
]
}
}
}
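To see why the must clause alone cannot separate the two cases, here is a rough Python sketch of what the analysis does. It is only an approximation under stated assumptions: the names `edge_ngrams` and `analyze_suggest` are hypothetical, the asciifolding filter is left out, and I assume the edge n-gram analyzer from the question's settings (min_gram 3, max_gram 20) is the one applied at index time.

```python
def edge_ngrams(token, min_gram=3, max_gram=20):
    """Return the prefixes of a token, from min_gram up to max_gram characters."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze_suggest(text):
    """Approximate the index-time analyzer: whitespace split, lowercase, edge n-grams.

    asciifolding is omitted here (assumption: the sample names are plain ASCII).
    """
    terms = set()
    for token in text.split():
        terms.update(edge_ngrams(token.lower()))
    return terms


names = ["Baritone horn", "Baritone ukulele", "Bari", "Baritone voice"]

# Search-time analysis of "Bari": whitespace + lowercase, no n-grams.
query_terms = {"bari"}

# The "must" match on the n-grammed field: every name contains the term "bari".
must_hits = [n for n in names if query_terms & analyze_suggest(n)]

# The "should" match on the not_analyzed field: only whole-value equality counts.
should_hits = [n for n in names if n == "Bari"]

print(must_hits)
print(should_hits)
```

In this sketch every name ends up in `must_hits`, while only the full name ends up in `should_hits`, which is the extra signal the should clause contributes.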
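In Elasticsearch: The Definitive Guide, queries in the should clause are called signal clauses, and this is how you distinguish a perfect match from an n-gram match. You still get all the documents that match the must clause, but documents that also match the should queries score higher because of the bool query scoring formula: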
score = ("must" queries total score + matching "should" queries total score) / (total number of "must" queries and "should" queries)
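As a quick worked example, the formula above can be taken at face value and applied to made-up clause scores. This is only a sketch: `bool_score` is a hypothetical helper, the numbers 1.0 and 2.0 are illustrative and not taken from any real response, and the actual Lucene scoring involves further normalization factors that are ignored here.

```python
def bool_score(must_scores, should_scores, total_clauses):
    """Simplified bool scoring: sum of matching clause scores over the clause count."""
    return (sum(must_scores) + sum(should_scores)) / total_clauses


# One "must" clause and one "should" clause, as in the query above.
TOTAL_CLAUSES = 2

# Hypothetical clause scores: every document matches the must clause equally,
# but only the full-name document also matches the should clause.
exact_match = bool_score(must_scores=[1.0], should_scores=[2.0], total_clauses=TOTAL_CLAUSES)
ngram_only = bool_score(must_scores=[1.0], should_scores=[], total_clauses=TOTAL_CLAUSES)

print(exact_match, ngram_only)
```

With these illustrative inputs the document that also satisfies the should clause scores three times higher, which is the kind of gap the should clause is meant to create.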
The result is what you expected: Bari is the first result (and far ahead in terms of score :)):
"hits": {
"total": 3,
"max_score": 0.4339554,
"hits": [
{
"_index": "attribute-tree",
"_type": "attribute",
"_id": "2360",
"_score": 0.4339554,
"_source": {
"name": "Bari",
"suggest": {
"input": [
"Bari"
],
"output": "Bari"
}
}
},
{
"_index": "attribute-tree",
"_type": "attribute",
"_id": "1786",
"_score": 0.04500804,
"_source": {
"name": "Baritone horn",
"suggest": {
"input": [
"Baritone",
"horn"
],
"output": "Baritone horn"
}
}
},
{
"_index": "attribute-tree",
"_type": "attribute",
"_id": "2313",
"_score": 0.04500804,
"_source": {
"name": "Baritone ukulele",
"suggest": {
"input": [
"Baritone",
"ukulele"
],
"output": "Baritone ukulele"
}
}
}
]