Question

I have this simple mapping:

PUT testindex
{
    "settings": {
        "analysis": {
            "analyzer": {
                "ngram_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "edgeNGram"]
                }
            },
            "filter" : {
                "ngram" : {
                   "type": "edgeNGram",
                   "min_gram": 2,
                   "max_gram": 15
                }
            }
        }
    },
    "mappings": {
        "test": {
           "properties": {
               "name": {
                    "type": "string",
                    "analyzer" : "ngram_analyzer"
                }
            }
        }
    }
}

With these values:

PUT testindex/test/1 
{"name" : "Power"}
PUT testindex/test/2 
{"name" : "Pow"}
PUT testindex/test/3
{"name" : "PowerMax"}
PUT testindex/test/4
{"name" : "PowerRangers"}

I ran this search:

GET testindex/test/_search
{
    "query": {
       "match": {
          "name": "Po"
       }
    }
}

and got back:

PowerRangers
Power
Pow
PowerMax

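All of them have a score of 0.2876821.

Obviously, the closest match to "Po" is "Pow", which I would expect to receive first, but I don't.

How should I modify my mapping to get this behavior?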

Answer 1

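I think script-based sorting is the solution, but it has the drawback of reduced performance. For more on this issue, see here. The query you can use is: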

GET testindex/test/_search
{
  "query": {
    "match": {
      "name": "Po"
    }
  },
  "sort": {
    "_script": {
      "script": "_source['name'].value.length",
      "type": "number",
      "order": "asc"
    }
  }
}
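
To illustrate why every document matches "Po" and what sorting by length changes, here is a minimal Python sketch. It is not Elasticsearch itself: `edge_ngrams` and `analyze` are hypothetical helpers that only approximate the analyzer, assuming the custom filter settings from the question (min_gram 2, max_gram 15) are the ones applied and that the query text goes through the same analysis.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Return the prefixes of `token` with lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze(text):
    """Rough stand-in for the analyzer: lowercase, split on whitespace, edge n-grams."""
    terms = set()
    for word in text.lower().split():
        terms.update(edge_ngrams(word))
    return terms


names = ["Power", "Pow", "PowerMax", "PowerRangers"]

# The query "Po" produces the single term "po", which every name also produces,
# so all four documents match on the same term.
query_terms = analyze("Po")
hits = [name for name in names if analyze(name) & query_terms]

# Sorting the hits by name length (ascending) puts the shortest match first.
ranked = sorted(hits, key=len)
print(ranked)
```

Under these assumptions all four names share the term "po", so the match alone cannot separate them; the ascending length sort is what moves "Pow" to the top.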

Search results sorted by search text length / match length
