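Following the official documentation, I am trying to use trigrams for compound words: https://www.elastic.co/guide/en/elasticsearch/guide/current/ngrams-compound-words.html
But my index seems to ignore the minimum_should_match parameter completely.
Relevant parts of the index _settings: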
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "trigrams": {
            "type": "ngram",
            "min_gram": "3",
            "max_gram": "3"
          }
        }
      }
    }
  }
}
Relevant parts of the _mapping:
{
  "myindex" : {
    "mappings" : {
      "story" : {
        "properties" : {
          "mediaObjects" : {
            "type" : "nested",
            "analyzer" : "standard",
            "title" : {
              "type" : "string",
              "fields" : {
                "trigram" : {
                  "type" : "string",
                  "analyzer" : "term_trigram"
                }
              }
            }
          }
        }
      }
    }
  }
}
And then this query:
{
  "_source": ["mediaObjects.title"],
  "query": {
    "nested": {
      "path": "mediaObjects",
      "query": {
        "match": {
          "mediaObjects.title.trigram": {
            "query": "rute",
            "minimum_should_match": 2
          }
        }
      }
    }
  }
}
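My understanding is that this produces the trigrams "rut" and "ute", and that the minimum_should_match parameter requires both of them to match (this is a simplified example that illustrates the problem at hand). The result of the query above is: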
{
  "hits": {
    "hits": [
      {
        "_type": "story",
        "_id": "9c9cd49e-5ff0-4811-9cd4-9e5ff0d81167",
        "_score": 10.104477,
        "_source": {
          "mediaObjects": [
            {"title": "Gullruten"}
          ]
        }
      },
      {
        "_type": "story",
        "_id": "7a4e6883-f532-49c5-8e68-83f532a9c554",
        "_score": 2.8516788,
        "_source": {
          "mediaObjects": [
            {"title": "Ulovlig skuter"}
          ]
        }
      },
      {
        "_type": "story",
        "_id": "058ca4e5-4d41-4603-8ca4-e54d41e603b9",
        "_score": 2.5049565,
        "_source": {
          "mediaObjects": [
            {"title": "zz-uten-typenavn-210716"}
          ]
        }
      }
    ]
  }
}
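To make the expectation concrete, here is a minimal Python sketch of what the query and the three hits above should do if both trigrams were required. It is not Elasticsearch and only approximates the analysis: the helper names `trigrams` and `matches` are made up for illustration, and the sketch slides a 3-character window over the whole lowercased title instead of tokenizing it first (an assumption that does not change the outcome for these titles).

```python
def trigrams(text):
    """Return all overlapping 3-character substrings of the lowercased text."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


def matches(query, title, minimum_should_match):
    """True if at least `minimum_should_match` query trigrams occur in the title."""
    title_grams = set(trigrams(title))
    hits = sum(1 for gram in trigrams(query) if gram in title_grams)
    return hits >= minimum_should_match


# Titles copied from the three hits above.
titles = ["Gullruten", "Ulovlig skuter", "zz-uten-typenavn-210716"]

query_grams = trigrams("rute")
expected = [title for title in titles if matches("rute", title, 2)]

print(query_grams)  # trigrams of the query string
print(expected)     # titles containing at least 2 of those trigrams
```

Under these assumptions "Gullruten" is the only title that contains both "rut" and "ute"; the other two contain only "ute".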
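My understanding is that only the first hit really belongs here. I have verified this by manually building a bool query with the two trigrams and minimum_number_should_match. What am I doing wrong?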
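The manually built query is not shown here, so the following Python sketch is only a guess at its shape. The helper `manual_trigram_query` is hypothetical, the use of `term` clauses is an assumption, and the parameter name `minimum_number_should_match` is taken from the text above (`minimum_should_match` is the more common spelling on a bool query).

```python
import json


def trigrams(text):
    """Return all overlapping 3-character substrings of the lowercased text."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


def manual_trigram_query(path, field, text, minimum):
    """Build a nested bool query with one `term` clause per trigram (assumed shape)."""
    should = [{"term": {field: gram}} for gram in trigrams(text)]
    return {
        "_source": ["mediaObjects.title"],
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        "should": should,
                        # Name as written in the question; an assumption about
                        # what the manual query actually used.
                        "minimum_number_should_match": minimum,
                    }
                },
            }
        },
    }


body = manual_trigram_query("mediaObjects", "mediaObjects.title.trigram", "rute", 2)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as a search request body. A query of this shape spells out both trigrams explicitly, so it does not depend on how the `match` query analyzes "rute".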
Explain output:
{
  "hits": {
    "total": 3,
    "max_score": 10.104477,
    "hits": [
      {
        "_shard": 0,
        "_node": "nxQQqSo7RcqtRMvIsavm4A",
        "_type": "story",
        "_id": "9c9cd49e-5ff0-4811-9cd4-9e5ff0d81167",
        "_score": 10.104477,
        "_source": {
          "mediaObjects": [
            {
              "title": "tittel"
            },
            {
              "title": "Gullruten"
            }
          ]
        },
        "_explanation": {
          "value": 10.104477,
          "description": "sum of:",
          "details": [
            {
              "value": 10.104477,
              "description": "Score based on child doc range from 0 to 1",
              "details": []
            },
            {
              "value": 0,
              "description": "match on required clause, product of:",
              "details": [
                {
                  "value": 0,
                  "description": "# clause",
                  "details": []
                },
                {
                  "value": 0.098966025,
                  "description": "#*:* -_type:__*, product of:",
                  "details": [
                    {
                      "value": 1,
                      "description": "boost",
                      "details": []
                    },
                    {
                      "value": 0.098966025,
                      "description": "queryNorm",
                      "details": []
                    }
                  ]
                }
              ]
            }
          ]
        }
      },
      {
        "_shard": 0,
        "_node": "nxQQqSo7RcqtRMvIsavm4A",
        "_type": "story",
        "_id": "7a4e6883-f532-49c5-8e68-83f532a9c554",
        "_score": 2.8516788,
        "_source": {
          "mediaObjects": [
            {
              "title": "Ulovlig skuter"
            },
            {
              "title": "Ulovlig skuter"
            }
          ]
        },
        "_explanation": {
          "value": 2.8516788,
          "description": "sum of:",
          "details": [
            {
              "value": 2.8516788,
              "description": "Score based on child doc range from 328 to 329",
              "details": []
            },
            {
              "value": 0,
              "description": "match on required clause, product of:",
              "details": [
                {
                  "value": 0,
                  "description": "# clause",
                  "details": []
                },
                {
                  "value": 0.098966025,
                  "description": "#*:* -_type:__*, product of:",
                  "details": [
                    {
                      "value": 1,
                      "description": "boost",
                      "details": []
                    },
                    {
                      "value": 0.098966025,
                      "description": "queryNorm",
                      "details": []
                    }
                  ]
                }
              ]
            }
          ]
        }
      },
      {
        "_shard": 3,
        "_node": "dKU8Zq1JQUq7PQiOrw9d5g",
        "_type": "story",
        "_id": "058ca4e5-4d41-4603-8ca4-e54d41e603b9",
        "_score": 2.5049565,
        "_source": {
          "mediaObjects": [
            {
              "title": "zz-uten-typenavn-210716"
            }
          ]
        },
        "_explanation": {
          "value": 2.5049565,
          "description": "sum of:",
          "details": [
            {
              "value": 2.5049565,
              "description": "Score based on child doc range from 712 to 712",
              "details": []
            },
            {
              "value": 0,
              "description": "match on required clause, product of:",
              "details": [
                {
                  "value": 0,
                  "description": "# clause",
                  "details": []
                },
                {
                  "value": 0.09091643,
                  "description": "#*:* -_type:__*, product of:",
                  "details": [
                    {
                      "value": 1,
                      "description": "boost",
                      "details": []
                    },
                    {
                      "value": 0.09091643,
                      "description": "queryNorm",
                      "details": []
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    ]
  }
}