ElasticSearch autocomplete

Time: 2014-07-22 08:18:21

Tags: search lucene elasticsearch

I have four documents, each with a field named "fullname".

Documents:

  • Abigail Harrison
  • Abigale Hardison
  • Abilene Havington
  • Abilene-Havington

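I want to build an autocomplete for this field. Some examples:

Search: "Abi" Results: "Abigail Harrison", "Abigale Hardison", "Abilene Havington"

Search: "Abig" Results: "Abigail Harrison", "Abigale Hardison"

Search: "Abigail Har" Results: "Abigail Harrison", "Abigale Hardison"

Search: "Abilene Hav" Results: "Abilene Havington", "Abilene-Havington"

Search: "Har" Results: "Abigail Harrison", "Abigale Hardison"

I do not want something like this (!):

Search: "iga" Results: "Abigail Harrison", "Abigale Hardison"

Whitespace and hyphens should be ignored, and I want all generated tokens to be lowercased so that search queries are case-insensitive.

My ES settings are as follows.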

{
    "mappings": {
        "person": {
            "properties": {
                "fullname": {
                    "index": "analyzed",
                    "index_analyzer": "autocomplete",
                    "search_analyzer": "standard",
                    "type": "string"
                }
            }
        }
    },
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "autocomplete": {
                        "filter": [
                            "lowercase",
                            "edgengram"
                        ],
                        "tokenizer": "whitespace"
                    }
                },
                "filter": {
                    "edgengram": {
                        "max_gram": 50,
                        "min_gram": 3,
                        "type": "edgeNGram"
                    }
                }
            }
        }
    }
}
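
To see which terms the settings above would put in the index, here is a rough pure-Python sketch of that analyzer chain (whitespace tokenizer, lowercase, front edge n-grams of length 3 to 50). It is only an approximation for illustration and does not call Elasticsearch; the function names are made up for this example.

```python
def whitespace_tokenize(text):
    # Stand-in for the "whitespace" tokenizer: split on whitespace only,
    # so a hyphenated name stays a single token.
    return text.split()


def front_ngrams(token, min_gram=3, max_gram=50):
    # Stand-in for the edgeNGram filter: prefixes of the token whose
    # length lies between min_gram and max_gram.
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def question_index_terms(text):
    # Approximate set of terms indexed for one "fullname" value.
    terms = set()
    for token in whitespace_tokenize(text):
        terms.update(front_ngrams(token.lower()))
    return terms


print("hav" in question_index_terms("Abilene Havington"))   # True
print("hav" in question_index_terms("Abilene-Havington"))   # False
print("iga" in question_index_terms("Abigail Harrison"))    # False
```

In this sketch the hyphenated name is kept as the single token "abilene-havington", so only prefixes of that whole string are produced and "hav" is not one of them.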

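1 answer:

Answer 0 (score: 1)

At index time, you should use the standard tokenizer together with the lowercase, asciifolding, suggestions_shingle, and edgengram filters; at search time, use the keyword analyzer.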

Try the following:

{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "autocomplete": {
                        "tokenizer": "standard",
                        "filter": [
                            "lowercase",
                            "asciifolding",
                            "suggestions_shingle",
                            "edgengram"
                        ]
                    }
                },
                "filter": {
                    "suggestions_shingle": {
                        "type": "shingle",
                        "min_shingle_size": 2,
                        "max_shingle_size": 5
                    },
                    "edgengram": {
                        "type": "edgeNGram",
                        "min_gram": 2,
                        "max_gram": 30,
                        "side": "front"
                    }
                }
            }
        }
    },
    "mappings": {
        "person": {
            "properties": {
                "fullname": {
                    "index": "analyzed",
                    "index_analyzer": "autocomplete",
                    "search_analyzer": "keyword",
                    "type": "string"
                }
            }
        }
    }
}
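
As a rough illustration of what this analyzer chain produces, the following pure-Python sketch imitates it without calling Elasticsearch. The regex tokenizer and the Unicode-based folding are simplified stand-ins for the standard tokenizer and the asciifolding filter, and the sketch assumes that the shingle filter keeps single tokens and joins shingles with one space. All function names are hypothetical.

```python
import re
import unicodedata


def standard_tokenize(text):
    # Simplified stand-in for the standard tokenizer: runs of word characters.
    return re.findall(r"\w+", text)


def ascii_fold(token):
    # Simplified stand-in for asciifolding: drop combining accent marks.
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def shingles(tokens, min_size=2, max_size=5):
    # Single tokens plus space-joined word sequences of min_size..max_size.
    out = list(tokens)
    for size in range(min_size, max_size + 1):
        for start in range(len(tokens) - size + 1):
            out.append(" ".join(tokens[start:start + size]))
    return out


def edge_ngrams(token, min_gram=2, max_gram=30):
    # Prefixes of the token with length between min_gram and max_gram.
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def index_terms(text):
    tokens = [ascii_fold(t.lower()) for t in standard_tokenize(text)]
    terms = set()
    for shingle in shingles(tokens):
        terms.update(edge_ngrams(shingle))
    return terms


def matches(query, document):
    # The keyword search analyzer leaves the query as one unchanged term.
    return query in index_terms(document)


print(matches("abigail har", "Abigail Harrison"))    # True
print(matches("abilene hav", "Abilene-Havington"))   # True
print(matches("iga", "Abigail Harrison"))            # False
```

The queries here are written in lowercase because the sketch compares the query verbatim, as the keyword analyzer would leave it, while the indexed terms are lowercased.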

Then try searching with a match query. It should solve your problem.
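
A minimal sketch of building such a match query body in Python follows. The helper name is hypothetical, and the client-side normalization (lowercasing, turning hyphens and repeated whitespace into single spaces) is an assumption added here because a keyword search analyzer passes the query text through unchanged; it is not part of the answer above.

```python
import json
import re


def build_autocomplete_query(user_input, field="fullname"):
    # Assumption: normalize on the client so the single query term lines up
    # with the lowercased, space-joined terms produced at index time.
    parts = [p for p in re.split(r"[\s\-]+", user_input.lower()) if p]
    normalized = " ".join(parts)
    # Standard match query body from the Elasticsearch query DSL.
    return {"query": {"match": {field: normalized}}}


body = build_autocomplete_query("Abilene-Hav")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the JSON body of a search request against the index that holds the "person" documents.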

Thanks