我的索引中有一个简单的“text”字段。
"keywordName": {
"type": "text"
}
我已经插入了这些文件:“samsung”,“samsung galaxy”,“三星封面”,“三星充电器”。
如果我进行简单的“匹配”查询,结果会令人不安:
查询:
GET keywords/_search
{
"query": {
"match": {
"keywordName": "samsung"
}
}
}
结果:
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 4,
"max_score": 1.113083,
"hits": [
{
"_index": "keywords",
"_type": "keyword",
"_id": "samsung galaxy",
"_score": 1.113083,
"_source": {
"keywordName": "samsung galaxy"
}
},
{
"_index": "keywords",
"_type": "keyword",
"_id": "samsung charger",
"_score": 0.9433406,
"_source": {
"keywordName": "samsung charger"
}
},
{
"_index": "keywords",
"_type": "keyword",
"_id": "samsung",
"_score": 0.8405092,
"_source": {
"keywordName": "samsung"
}
},
{
"_index": "keywords",
"_type": "keyword",
"_id": "samsung cover",
"_score": 0.58279467,
"_source": {
"keywordName": "samsung cover"
}
}
]
}
}
第一个问题:为什么“三星”没有得分最高?
第二个问题:我怎样才能使查询或分析器给我“三星”作为最高分?
答案 0 :(得分:1)
从我previous reply中的相同索引设置(分析器,过滤器,映射)开始,我建议使用以下解决方案。但是,正如我所提到的,您需要根据您在此索引中搜索的内容来规定所有要求,并将所有这些要求视为一个完整的解决方案。
DELETE test
PUT test
{
"settings": {
"analysis": {
"analyzer": {
"custom_stop": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"my_stop",
"my_snow",
"asciifolding"
]
}
},
"filter": {
"my_stop": {
"type": "stop",
"stopwords": "_french_"
},
"my_snow": {
"type": "snowball",
"language": "French"
}
}
}
},
"mappings": {
"test": {
"properties": {
"keywordName": {
"type": "text",
"analyzer": "custom_stop",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
}
}
}
POST /test/test/_bulk
{"index":{}}
{"keywordName":"samsung galaxy"}
{"index":{}}
{"keywordName":"samsung charger"}
{"index":{}}
{"keywordName":"samsung cover"}
{"index":{}}
{"keywordName":"samsung"}
GET /test/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"keywordName": {
"query": "samsungs",
"operator": "and"
}
}
},
{
"term": {
"keywordName.raw": {
"value": "samsungs"
}
}
},
{
"fuzzy": {
"keywordName.raw": {
"value": "samsungs",
"fuzziness": 1
}
}
}
]
}
},
"size": 10
}