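Elasticsearch query: changing the displayed results according to score
The current query returns results for the title field in the following order.
Shouldn't 3. be the first result instead?
Also, Foxes Quick Quick contains Quick twice, so it should get some preference in the query results. Yet it comes in at position 4.
Index settings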
{
  "fundraisers": {
    "settings": {
      "index": {
        "number_of_shards": "5",
        "provided_name": "fundraisers",
        "creation_date": "1546515635025",
        "analysis": {
          "analyzer": {
            "my_analyzer": {
              "filter": [
                "lowercase"
              ],
              "tokenizer": "my_tokenizer"
            },
            "search_analyzer_search": {
              "filter": [
                "lowercase"
              ],
              "tokenizer": "search_tokenizer_search"
            }
          },
          "tokenizer": {
            "my_tokenizer": {
              "token_chars": [
                "letter",
                "digit"
              ],
              "min_gram": "3",
              "type": "edge_ngram",
              "max_gram": "50"
            },
            "search_tokenizer_search": {
              "token_chars": [
                "letter",
                "digit",
                "whitespace"
              ],
              "min_gram": "3",
              "type": "ngram",
              "max_gram": "50"
            }
          }
        },
        "number_of_replicas": "1",
        "uuid": "mVweO4_sT3Ww00MzdLyavw",
        "version": {
          "created": "6020399"
        }
      }
    }
  }
}
Query
GET fundraisers/_search?explain=true
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "qui",
        "analyzer": "my_analyzer"
      }
    }
  }
}
Mapping
{
  "fundraisers": {
    "mappings": {
      "fundraisers": {
        "properties": {
          "status": {
            "type": "text"
          },
          "suggest": {
            "type": "completion",
            "analyzer": "simple",
            "preserve_separators": true,
            "preserve_position_increments": true,
            "max_input_length": 50
          },
          "title": {
            "type": "text",
            "analyzer": "my_analyzer"
          },
          "twitterUrl": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "videoLinks": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "zipCode": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
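Am I overcomplicating this by using match_phrase, a search analyzer, and n-grams, or is there a simpler way to get the expected results?
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/query-dsl-match-query.html
Answer 0 (score: 0)
OK, first let's create a minimal, reproducible setup: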
PUT test
{
  "settings": {
    "index": {
      "number_of_shards": "1",
      "number_of_replicas": "1",
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "filter": [
              "lowercase"
            ],
            "tokenizer": "my_tokenizer"
          },
          "search_analyzer_search": {
            "filter": [
              "lowercase"
            ],
            "tokenizer": "search_tokenizer_search"
          }
        },
        "tokenizer": {
          "my_tokenizer": {
            "token_chars": [
              "letter",
              "digit"
            ],
            "min_gram": "3",
            "type": "edge_ngram",
            "max_gram": "50"
          },
          "search_tokenizer_search": {
            "token_chars": [
              "letter",
              "digit",
              "whitespace"
            ],
            "min_gram": "3",
            "type": "ngram",
            "max_gram": "50"
          }
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}
PUT test/_doc/1
{
  "title": "Quick 123"
}
PUT test/_doc/2
{
  "title": "Foxes Quick"
}
PUT test/_doc/3
{
  "title": "Quick"
}
PUT test/_doc/4
{
  "title": "Foxes Quick Quick"
}
PUT test/_doc/5
{
  "title": "Quick Foxes"
}
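To see why a short input such as qui matches these documents at all, it may help to look at the terms my_analyzer emits at index time. A minimal sketch using the _analyze API against the test index created above (the sample text is simply one of the titles indexed above):

```
GET test/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Foxes Quick Quick"
}
```

With min_gram set to 3 and the lowercase filter, the response should contain prefix terms such as fox, foxe and foxes and, once for each occurrence of Quick, qui, quic and quick. The exact positions and offsets are best read from the actual response rather than assumed.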
Then let's try the simplest query:
GET test/_search
{
  "query": {
    "match": {
      "title": {
        "query": "qui"
      }
    }
  }
}
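Now your order is:
That's pretty much what you expected, right? There may be other use cases that this query doesn't cover, but IMO you would then have to use multi_match and search with different analyzers, because I'm not sure a phrase search on edge n-grams makes sense.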
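The multi_match idea could look roughly like the following. This is only a sketch under assumptions: the sub-field name title.std and its use of the built-in standard analyzer are hypothetical and not part of the original mapping, and whether the resulting ranking fits your use case would still need checking with explain=true.

```
# Hypothetical: add a sub-field analyzed as whole words (not in the original mapping)
PUT test/_mapping/_doc
{
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "my_analyzer",
      "fields": {
        "std": {
          "type": "text",
          "analyzer": "standard"
        }
      }
    }
  }
}

# Re-index the existing documents in place so the new sub-field gets populated
POST test/_update_by_query?refresh

# Score on both the edge n-gram field and the whole-word sub-field
GET test/_search
{
  "query": {
    "multi_match": {
      "query": "quick",
      "type": "most_fields",
      "fields": ["title", "title.std"]
    }
  }
}
```

Partial input such as qui would still match only through title, since the standard analyzer indexes whole words. A complete word like quick can match both fields, and most_fields combines the scores of the fields that match.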