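Can someone tell me why this Elasticsearch query returns the results below? The query contains a bool with a must clause, which I expect to match only when the nn field contains an exact match for the string "softo". The query is: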
"query":{
"bool":{
"must":[
{"match":{"nn":"softo"}}
],
"should":[
{"match":{"nn":"sro"}},
{"match":{"nn":"as"}},
{"match":{"nn":"no"}},
{"match":{"nn":"vos"}},
{"match":{"nn":"ks"}}
]
}
}
It returns results whose nn field does not contain "softo":
{
"_index": "search_2",
"_type": "doc",
"_id": "17053188",
"_score": 129.76167,
"_source": {
"nn": "zo soz kovo zts nova as zts elektronika as",
"nazov": "ZO SOZ KOVO,ZŤS NOVA a.s.,ZTS ELEKTRONIKA a.s.",
}
},
{
"_index": "search_2",
"_type": "doc",
"_id": "45732078",
"_score": 126.953285,
"_source": {
"nn": "agentura socialnych sluzieb ass no",
"nazov": "Agentúra sociálnych služieb - ASS n.o.",
}
}
I don't understand why it returns results such as "zo soz kovo zts nova as zts elektronika as", which do not contain the string "softo". The mapping of the nn field is as follows:
{
"search_2": {
"aliases": {
"search": {}
},
"mappings": {
"doc": {
"dynamic": "strict",
"properties": {
"nn": {
"type": "text",
"boost": 10,
"analyzer": "autocomplete"
}
}
}
},
"settings": {
"index": {
"refresh_interval": "-1",
"number_of_shards": "4",
"provided_name": "search_2",
"creation_date": "1539693645683",
"analysis": {
"filter": {
"synonym_filter": {
"ignore_case": "true",
"type": "synonym",
"synonyms_path": "synonyms/sk_SK.txt"
},
"lemmagen_filter_sk": {
"type": "lemmagen",
"lexicon": "sk"
},
"stopwords_SK": {
"ignore_case": "true",
"type": "stop",
"stopwords_path": "stopwords/slovak.txt"
},
"remove_duplicities": {
"type": "unique",
"only_on_same_position": "true"
},
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": "2",
"max_gram": "20"
}
},
"analyzer": {
"autocomplete": {
"filter": [
"stopwords_SK",
"lowercase",
"stopwords_SK",
"autocomplete_filter"
],
"type": "custom",
"tokenizer": "standard"
},
"lower_ascii": {
"filter": [
"lowercase",
"asciifolding"
],
"type": "custom",
"tokenizer": "standard"
},
"suggestion": {
"filter": [
"stopwords_SK",
"lowercase",
"stopwords_SK",
"asciifolding"
],
"type": "custom",
"tokenizer": "standard"
}
}
},
"number_of_replicas": "1",
"uuid": "eyxXza0pQxWeQCpXih8ngg",
"version": {
"created": "6020399"
}
}
}
}
}
Answer 0 (score: 4)
You get these results because of the autocomplete analyzer applied to the nn field.
I'll explain using the following field value:
"nn": "zo soz kovo zts nova as zts elektronika as"
The tokens generated for the value above will be:
zo, so, soz, ko, kov, kovo, zt, zts, no, nov, nova, as, zt, zts, el, ele, elek, elekt, elektr, elektro, elektron, elektroni, elektronik, elektronika, as
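To make the token list easier to verify, here is a minimal Python sketch of what the edge_ngram filter does with min_gram 2 and max_gram 20. It is only an approximation of the autocomplete analyzer: the function names are made up for illustration, the stopwords_SK filter is skipped (the contents of stopwords/slovak.txt are not shown in the question), and a simple regex stands in for the standard tokenizer.

```python
import re


def edge_ngrams(token, min_gram=2, max_gram=20):
    """Return the prefixes of `token` with length min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def autocomplete_analyze(text):
    """Rough stand-in for the `autocomplete` analyzer (assumption:
    no stop words are removed, and splitting on non-word characters
    is close enough to the standard tokenizer for this input)."""
    words = re.findall(r"\w+", text.lower())
    tokens = []
    for word in words:
        tokens.extend(edge_ngrams(word))
    return tokens


tokens = autocomplete_analyze("zo soz kovo zts nova as zts elektronika as")
print(", ".join(tokens))
```

On a real cluster, the _analyze API with the index's autocomplete analyzer is the authoritative way to see the tokens, since it also applies the stop word filter.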
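By default, a match query analyzes the search string with the same analyzer as the field, and the default operator between the resulting tokens is OR. So {"match":{"nn":"softo"}} effectively behaves like: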
{
"match": {
"nn": "so OR sof OR soft OR softo"
}
}
As you can see, one of the tokens generated for the nn field is so, which is why the document matches.
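The overlap can be reproduced with a small Python simulation. This is a sketch under stated assumptions, not Elasticsearch code: index_analyze, plain_analyze and match_or are hypothetical helpers, stop words are ignored, and plain_analyze stands for a search-time analyzer without the edge_ngram filter (Elasticsearch lets you configure one through the search_analyzer mapping parameter, which the answer above does not cover).

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Prefixes of `token` with length min_gram..max_gram."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def index_analyze(text):
    """Approximation of the `autocomplete` analyzer: lowercase,
    split on whitespace, then edge n-grams (stop words ignored)."""
    return [gram for word in text.lower().split() for gram in edge_ngrams(word)]


def plain_analyze(text):
    """Hypothetical search-time analyzer: same steps, but no n-grams."""
    return text.lower().split()


def match_or(query_tokens, doc_tokens):
    """OR semantics: the document is a hit if any query token occurs in it.
    Returns the shared tokens so the reason for a hit is visible."""
    return sorted(set(query_tokens) & set(doc_tokens))


doc = index_analyze("zo soz kovo zts nova as zts elektronika as")

# Search string analyzed with the index analyzer: so, sof, soft, softo
same_analyzer_hits = match_or(index_analyze("softo"), doc)

# Search string kept whole: softo
plain_analyzer_hits = match_or(plain_analyze("softo"), doc)

print(same_analyzer_hits)
print(plain_analyzer_hits)
```

With the index analyzer on both sides, the shared token is so (from "soz"), which is enough for an OR match. When the search string is not split into n-grams, there is no shared token and the document is not a hit.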
Answer 1 (score: 1)
You can change "match" to "term" in the must query.
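A "match" query analyzes the search string with the field's analyzer before searching, so it answers the question "how well does this string match?".
A "term" query does not analyze the search string. It looks up the exact term in the index, so it answers a simple yes-or-no question (match or no match). Note that a "term" query inside "must" still contributes to the score.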
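To illustrate why switching to "term" changes which documents come back, here is a small Python sketch. It assumes the index holds the edge n-gram tokens described in the other answer; the helper names and the "softo sro" document are hypothetical and only serve as an example.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Prefixes of `token` with length min_gram..max_gram."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def index_analyze(text):
    """Approximation of the `autocomplete` analyzer (stop words ignored)."""
    return [gram for word in text.lower().split() for gram in edge_ngrams(word)]


def term_matches(term, doc_tokens):
    """A term lookup does not analyze its input: the term has to be
    present in the indexed tokens exactly as given."""
    return term in doc_tokens


unrelated = index_analyze("zo soz kovo zts nova as zts elektronika as")
wanted = index_analyze("softo sro")  # hypothetical document, for illustration

print(term_matches("softo", unrelated))
print(term_matches("softo", wanted))
print(term_matches("Softo", wanted))
```

The unrelated document only contains so and soz, so the exact term softo is not found. The hypothetical document is found because softo is one of its indexed prefixes. The last call shows the trade-off: the term is not lowercased for you, so "Softo" finds nothing.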
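If you do need full-text search, you can keep "match" in the "must" query and boost its score.
For example, to boost it by a factor of 5, it would look like this: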
"must":[
{"match": {"nn": {"boost": 5, "query": "softo"}}}
]