I have been trying to get trigrams out of the Elasticsearch tokenizers. I followed the tutorials at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-ngram-tokenizer.html and http://blog.qbox.io/multi-field-partial-word-autocomplete-in-elasticsearch-using-ngrams
Following those documents, I tested the analyzer with
curl 'localhost:9200/test/_analyze?pretty=1&analyzer=my_ngram_analyzer' -d 'FC Schalke 04'
which produces tokens like
# FC, Sc, Sch, ch, cha, ha, hal, al, alk, lk, lke, ke, 04
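For illustration, the token list above is what character-level n-grams look like. The Python sketch below is not Elasticsearch code; it assumes gram lengths of 2 to 3 and splitting on anything that is not a letter or digit (both inferred from the output, not stated in the question), and the function name char_ngrams is hypothetical.

```python
import re


def char_ngrams(text, min_gram=2, max_gram=3):
    """Emit character n-grams for each word, in position order.

    Assumption: words are runs of ASCII letters or digits, and
    min_gram/max_gram are 2 and 3 (inferred from the sample output).
    """
    grams = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        for start in range(len(word)):
            for size in range(min_gram, max_gram + 1):
                # Skip grams that would run past the end of the word.
                if start + size <= len(word):
                    grams.append(word[start:start + size])
    return grams


print(", ".join(char_ngrams("FC Schalke 04")))
```

Every gram stays inside a single word, which is why this approach never yields multi-word phrases.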
What I want, however, are whole-word trigrams.
For example, the trigrams for "the quick red fox jumps over the lazy brown dog"
would be:
the quick red
quick red fox
red fox jumps
fox jumps over
jumps over the
over the lazy
the lazy brown
lazy brown dog
In short, how can I create the trigrams above using Elasticsearch?
Answer 0 (score: 3)
Found it. The answer lies in the shingle filter. This mapping makes it work:
{
  "settings": {
    "analysis": {
      "filter": {
        "nGram_filter": {
          "type": "shingle",
          "max_shingle_size": 3,
          "min_shingle_size": 3,
          "output_unigrams": false
        }
      },
      "analyzer": {
        "nGram_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "asciifolding",
            "nGram_filter"
          ]
        },
        "whitespace_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  }
}
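The key properties here are type -> shingle and the min / max shingle sizes, both set to 3. Setting output_unigrams to false keeps single words out of the output, so only three-word shingles are emitted.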
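To make the effect of that analyzer chain concrete, here is a rough Python sketch of what it does to a sentence: split on whitespace, lowercase, then emit every run of three consecutive words. It only approximates the behavior (asciifolding is left out), it is not how Elasticsearch implements the filter, and the function name word_shingles is hypothetical.

```python
def word_shingles(text, size=3):
    """Approximate whitespace tokenizer + lowercase + shingle filter.

    Assumption: min and max shingle size are both `size` and unigrams
    are suppressed, so only full `size`-word shingles come out.
    """
    tokens = text.lower().split()
    # Slide a window of `size` words across the token list.
    return [" ".join(tokens[i:i + size])
            for i in range(len(tokens) - size + 1)]


for shingle in word_shingles("the quick red fox jumps over the lazy brown dog"):
    print(shingle)
```

For the sentence in the question this prints the eight trigrams listed there. An input with fewer than three words yields nothing in this sketch.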