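I've been reading about Elasticsearch suggesters in blog posts such as: https://www.elastic.co/blog/you-complete-me
However, you have to put your own data into the name_suggest field yourself; it is not filled in automatically when the document is mapped.
So I would like to update this mapping: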
curl -X PUT localhost:9200/hotels -d '
{
  "mappings": {
    "hotel" : {
      "properties" : {
        "name" : { "type" : "string" },
        "city" : { "type" : "string" },
        "name_suggest" : {
          "type" : "completion"
        }
      }
    }
  }
}'
and these PUT requests:
curl -X PUT localhost:9200/hotels/hotel/1 -d '
{
  "name" : "Mercure Hotel Munich",
  "city" : "Munich",
  "name_suggest" : "Mercure Hotel Munich"
}'
curl -X PUT localhost:9200/hotels/hotel/2 -d '
{
  "name" : "Hotel Monaco",
  "city" : "Munich",
  "name_suggest" : "Hotel Monaco"
}'
curl -X PUT localhost:9200/hotels/hotel/3 -d '
{
  "name" : "Courtyard by Marriot Munich City",
  "city" : "Munich",
  "name_suggest" : "Courtyard by Marriot Munich City"
}'
so that we can drop the name_suggest field.
The end goal is that when you start typing Ho, the first result is Hotel.
Answer 0 (score: 0)
If you want partial matches inside words, you can use ngrams; if you only want to match from the beginning of a word, you can use edge ngrams.
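To make the difference concrete, here is a rough pure-Python sketch of the two ideas. It is not Elasticsearch code: the helper names ngrams and edge_ngrams are made up for illustration, and the order of the tokens is not meant to mirror what Elasticsearch emits.

```python
def ngrams(word, min_gram, max_gram):
    """All substrings of word with a length between min_gram and max_gram."""
    return [
        word[start:start + size]
        for size in range(min_gram, max_gram + 1)
        for start in range(len(word) - size + 1)
    ]


def edge_ngrams(word, min_gram, max_gram):
    """Only the prefixes of word with a length between min_gram and max_gram."""
    longest = min(max_gram, len(word))
    return [word[:size] for size in range(min_gram, longest + 1)]


# Hypothetical helpers, for illustration only.
print(ngrams("hotel", 2, 3))        # ['ho', 'ot', 'te', 'el', 'hot', 'ote', 'tel']
print(edge_ngrams("hotel", 2, 20))  # ['ho', 'hot', 'hote', 'hotel']
```

With plain ngrams a query such as "ote" would also match "hotel", whereas edge ngrams only produce prefixes, which is usually what you want for search-as-you-type.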
Here is an example. I set up an index like this:
PUT /test_index
{
  "settings": {
    "analysis": {
      "filter": {
        "edge_ngram_filter": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 20
        }
      },
      "analyzer": {
        "edge_ngram_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "edge_ngram_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "name": {
          "type": "string",
          "index_analyzer": "edge_ngram_analyzer",
          "search_analyzer": "standard"
        },
        "city": {
          "type": "string"
        }
      }
    }
  }
}
Then I added your documents:
POST /test_index/doc/_bulk
{"index":{"_id":1}}
{"name":"Mercure Hotel Munich","city":"Munich"}
{"index":{"_id":2}}
{"name":"Hotel Monaco","city":"Munich"}
{"index":{"_id":3}}
{"name":"Courtyard by Marriot Munich City","city":"Munich"}
Now I can query for documents whose name contains "hot", like this:
POST /test_index/_search
{
  "query": {
    "match": {
      "name": "hot"
    }
  }
}
and I get back the expected documents:
{
  "took": 41,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.625,
    "hits": [
      {
        "_index": "test_index",
        "_type": "doc",
        "_id": "2",
        "_score": 0.625,
        "_source": {
          "name": "Hotel Monaco",
          "city": "Munich"
        }
      },
      {
        "_index": "test_index",
        "_type": "doc",
        "_id": "1",
        "_score": 0.5,
        "_source": {
          "name": "Mercure Hotel Munich",
          "city": "Munich"
        }
      }
    ]
  }
}
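If it helps to see why documents 1 and 2 match while document 3 does not, the following Python sketch imitates the two analyzers. It is only an approximation under stated assumptions: a simple regex stands in for the standard tokenizer, the function names are hypothetical, and no scoring is done, so it says nothing about the order of the hits.

```python
import re

MIN_GRAM, MAX_GRAM = 2, 20  # same values as edge_ngram_filter in the index settings


def index_tokens(text):
    """Approximate edge_ngram_analyzer: split into words, lowercase, emit prefixes."""
    tokens = set()
    for word in re.findall(r"\w+", text.lower()):
        for size in range(MIN_GRAM, min(MAX_GRAM, len(word)) + 1):
            tokens.add(word[:size])
    return tokens


def search_tokens(text):
    """Approximate the standard analyzer: split into words and lowercase, no n-grams."""
    return set(re.findall(r"\w+", text.lower()))


def matching_ids(docs, query):
    """IDs of documents that share at least one token with the analyzed query."""
    wanted = search_tokens(query)
    return [doc_id for doc_id, name in docs.items() if index_tokens(name) & wanted]


docs = {
    1: "Mercure Hotel Munich",
    2: "Hotel Monaco",
    3: "Courtyard by Marriot Munich City",
}

print(matching_ids(docs, "hot"))  # [1, 2]
print(matching_ids(docs, "Ho"))   # [1, 2]
print(matching_ids(docs, "h"))    # [] because min_gram is 2
```

The query text is deliberately not n-grammed (that is the job of "search_analyzer": "standard"), so "hot" is compared as a whole token against the prefixes stored at index time.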
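There are several ways to tweak or generalize this. For example, if you want to match on multiple fields, you could apply the ngram analyzer to the _all field.
Here is the code I used to test it: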
http://sense.qbox.io/gist/3583de02c4f7d33e07ba4c2def9badf90692a290