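I am trying to implement autocomplete over phrases that contain multiple words.
I want to match only on the beginning of words (edgeNGram?), and every searched word must match.
For example, if I search for "monitor" I should get back all phrases containing the word monitor, but if I search for "onitor" I should get no matches (from the dataset below). In addition, searching for "mon ap" should give me, for example, "APNEA MONITOR- SCHULTE Vital Signs Monitor", while "mon rrr" should give no results.
So my question is: how should I implement this?
In short: a matching phrase must contain, for each search term, a word that starts with that term.
Here is my mapping: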
{
"quicksearch2" : {
"results" : {
"properties" : {
"phrase" : {
"type" : "string",
"index_analyzer" : "quicksearch_index_analyzer",
"search_analyzer" : "quicksearch_search_analyzer"
}
}
}
}
}
Here are my settings:
{
"quicksearch2" : {
"settings" : {
"index.analysis.analyzer.quicksearch_index_analyzer.filter.4" : "left_ngram",
"index.analysis.analyzer.quicksearch_search_analyzer.filter.3" : "unique",
"index.analysis.analyzer.quicksearch_index_analyzer.filter.3" : "unique",
"index.analysis.filter.left_ngram.max_gram" : "20",
"index.analysis.analyzer.quicksearch_search_analyzer.filter.2" : "asciifolding",
"index.analysis.analyzer.quicksearch_search_analyzer.tokenizer" : "keyword",
"index.analysis.analyzer.quicksearch_search_analyzer.filter.1" : "lowercase",
"index.number_of_replicas" : "0",
"index.analysis.analyzer.quicksearch_search_analyzer.filter.0" : "trim",
"index.analysis.filter.left_ngram.type" : "edgeNGram",
"index.analysis.analyzer.quicksearch_search_analyzer.type" : "custom",
"index.analysis.analyzer.quicksearch_index_analyzer.filter.0" : "trim",
"index.analysis.analyzer.quicksearch_index_analyzer.filter.2" : "asciifolding",
"index.analysis.analyzer.quicksearch_index_analyzer.filter.1" : "lowercase",
"index.analysis.analyzer.quicksearch_index_analyzer.type" : "custom",
"index.analysis.filter.left_ngram.side" : "front",
"index.analysis.analyzer.quicksearch_index_analyzer.tokenizer" : "keyword",
"index.number_of_shards" : "1",
"index.version.created" : "900899",
"index.uuid" : "Lb7vC-eHQB-u_Okm3ERLow"
}
}
}
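As a reading aid, the flat dotted keys above can be regrouped into nested form, which makes the filter order of the index analyzer easier to see. This is only a minimal Python sketch: `nest` is a hypothetical helper written for this illustration, and only the index-analyzer and `left_ngram` keys are copied in.

```python
# Regroup flat "a.b.c" settings keys into nested dicts/lists for readability.
# `nest` is a hypothetical helper, not part of any Elasticsearch client.
flat = {
    "index.analysis.analyzer.quicksearch_index_analyzer.type": "custom",
    "index.analysis.analyzer.quicksearch_index_analyzer.tokenizer": "keyword",
    "index.analysis.analyzer.quicksearch_index_analyzer.filter.0": "trim",
    "index.analysis.analyzer.quicksearch_index_analyzer.filter.1": "lowercase",
    "index.analysis.analyzer.quicksearch_index_analyzer.filter.2": "asciifolding",
    "index.analysis.analyzer.quicksearch_index_analyzer.filter.3": "unique",
    "index.analysis.analyzer.quicksearch_index_analyzer.filter.4": "left_ngram",
    "index.analysis.filter.left_ngram.type": "edgeNGram",
    "index.analysis.filter.left_ngram.side": "front",
    "index.analysis.filter.left_ngram.max_gram": "20",
}


def nest(flat_settings):
    """Turn dotted keys into nested dicts; dicts keyed 0,1,2,... become lists."""
    root = {}
    for key, value in flat_settings.items():
        parts = key.split(".")
        node = root
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value

    def listify(node):
        if not isinstance(node, dict):
            return node
        if node and all(k.isdigit() for k in node):
            return [listify(node[k]) for k in sorted(node, key=int)]
        return {k: listify(v) for k, v in node.items()}

    return listify(root)


nested = nest(flat)
index_analyzer = nested["index"]["analysis"]["analyzer"]["quicksearch_index_analyzer"]
print(index_analyzer["tokenizer"])
print(index_analyzer["filter"])
```

Read this way, the index analyzer is the keyword tokenizer followed by trim, lowercase, asciifolding, unique and left_ngram, in that order.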
Here is my query:
query: {
match: {
phrase: {
query: term,
operator: 'and'
}
}
}
Some sample data:
{
"took" : 133,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 6197,
"max_score" : 1.491863,
"hits" : [ {
"_index" : "quicksearch2",
"_type" : "results",
"_id" : "emCydgTfQwuKkl4sSZoosQ",
"_score" : 1.491863,
"fields" : {
"phrase" : "APNEA MONITOR- SCHULTE Apnea Monitor"
}
}, {
"_index" : "quicksearch2",
"_type" : "results",
"_id" : "AXCO5rUxRwC9SebXcQxXeQ",
"_score" : 1.491863,
"fields" : {
"phrase" : "APNEA MONITOR- SCHULTE Apnea Monitor, Neonatal"
}
}, {
"_index" : "quicksearch2",
"_type" : "results",
"_id" : "tjJq3klPTsmP8akOc18Htw",
"_score" : 1.491863,
"fields" : {
"phrase" : "APNEA MONITOR- SCHULTE Apnea Monitor, Recording"
}
}, {
"_index" : "quicksearch2",
"_type" : "results",
"_id" : "-FjKWxl9Rm6-byn-wlpoIw",
"_score" : 1.491863,
"fields" : {
"phrase" : "APNEA MONITOR- SCHULTE Cardiorespiratory Monitor"
}
}, {
"_index" : "quicksearch2",
"_type" : "results",
"_id" : "Q19k6V6VQ6ulZOLCfESQ6w",
"_score" : 1.491863,
"fields" : {
"phrase" : "APNEA MONITOR- SCHULTE Impedance Pneumograph Bedside Monitor"
}
}, {
"_index" : "quicksearch2",
"_type" : "results",
"_id" : "YLI1er3cRjSyGumWNVi0pg",
"_score" : 1.491863,
"fields" : {
"phrase" : "APNEA MONITOR- SCHULTE Impedance Pneumograph Monitor"
}
}, {
"_index" : "quicksearch2",
"_type" : "results",
"_id" : "n5j1SaXeS2W6NymaYAYD6A",
"_score" : 1.491863,
"fields" : {
"phrase" : "APNEA MONITOR- SCHULTE Neonatal Monitor"
}
}, {
"_index" : "quicksearch2",
"_type" : "results",
"_id" : "U7Q5XrrHRbKOIwfRWO6RTQ",
"_score" : 1.491863,
"fields" : {
"phrase" : "APNEA MONITOR- SCHULTE Pulmonary Function Monitor"
}
}, {
"_index" : "quicksearch2",
"_type" : "results",
"_id" : "aF_THiCKRIyzunCbBxJTEg",
"_score" : 1.491863,
"fields" : {
"phrase" : "APNEA MONITOR- SCHULTE Vital Signs Monitor"
}
}, {
"_index" : "quicksearch2",
"_type" : "results",
"_id" : "8BAjZfwMQjWmrkqCO7o6gg",
"_score" : 1.491863,
"fields" : {
"phrase" : "P.P.M. - PORTABLE PRECISION MONITOR Gas Monitor, Atmospheric"
}
} ]
}
}
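Answer 0 (score: 1)
I'm not sure exactly why what you are doing isn't working, but here is an approach that seems to do what you want.
I created an index with these settings:
curl -XPUT "http://localhost:9200/test_index" -d'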
{
"settings": {
"analysis": {
"filter": {
"my_edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": 2,
"max_gram": 20,
"token_chars": [
"letter",
"digit"
]
}
},
"analyzer": {
"my_ngram_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase",
"asciifolding",
"my_edge_ngram_filter"
]
},
"my_whitespace_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mappings": {
"docs": {
"properties": {
"phrase": {
"type": "string",
"index_analyzer": "my_ngram_analyzer",
"search_analyzer": "my_whitespace_analyzer"
}
}
}
}
}'
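To make it clearer what `my_ngram_analyzer` puts into the index, here is a rough Python sketch of the same chain (whitespace tokenizer, lowercase, asciifolding, edge n-grams of length 2 to 20). It only imitates the analysis offline and is not Elasticsearch code; the function names are made up for the illustration, and `ascii_fold` is a simplified stand-in that just strips combining accents.

```python
import unicodedata


def ascii_fold(text):
    """Simplified stand-in for the asciifolding filter: drop combining accents."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edge_ngrams(token, min_gram=2, max_gram=20):
    """Prefixes of `token` whose length lies between min_gram and max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def index_tokens(phrase):
    """Imitate: whitespace tokenizer -> lowercase -> asciifolding -> edge n-grams."""
    tokens = []
    for word in phrase.split():
        tokens.extend(edge_ngrams(ascii_fold(word.lower())))
    return tokens


print(index_tokens("Apnea Monitor"))
# ['ap', 'apn', 'apne', 'apnea', 'mo', 'mon', 'moni', 'monit', 'monito', 'monitor']
```

Every indexed term is a prefix of some word, so a search term such as "onitor" has nothing to match. The search analyzer leaves out the n-gram step, so each typed term is compared as-is against these prefixes.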
Then I added the documents you listed:
curl -XPOST "http://localhost:9200/test_index/_bulk" -d'
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : "1" } }
{ "phrase" : "APNEA MONITOR- SCHULTE Apnea Monitor" }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : "2" } }
{ "phrase" : "APNEA MONITOR- SCHULTE Apnea Monitor, Neonatal" }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : "3" } }
{ "phrase" : "APNEA MONITOR- SCHULTE Apnea Monitor, Recording" }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : "4" } }
{ "phrase" : "APNEA MONITOR- SCHULTE Cardiorespiratory Monitor" }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : "5" } }
{ "phrase" : "APNEA MONITOR- SCHULTE Impedance Pneumograph Bedside Monitor" }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : "6" } }
{ "phrase" : "APNEA MONITOR- SCHULTE Impedance Pneumograph Monitor" }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : "7" } }
{ "phrase" : "APNEA MONITOR- SCHULTE Neonatal Monitor" }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : "8" } }
{ "phrase" : "APNEA MONITOR- SCHULTE Pulmonary Function Monitor" }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : "9" } }
{ "phrase" : "APNEA MONITOR- SCHULTE Vital Signs Monitor" }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : "10" } }
{ "phrase" : "P.P.M. - PORTABLE PRECISION MONITOR Gas Monitor, Atmospheric" }
'
The following searches seem to return the results you expect:
curl -XPOST "http://localhost:9200/test_index/_search" -d'
{
"query": {
"match": {
"phrase" : {
"query": "monitor",
"operator": "and"
}
}
}
}'
returns all the documents,
curl -XPOST "http://localhost:9200/test_index/_search" -d'
{
"query": {
"match": {
"phrase" : {
"query": "onitor",
"operator": "and"
}
}
}
}'
returns nothing,
curl -XPOST "http://localhost:9200/test_index/_search" -d'
{
"query": {
"match": {
"phrase" : {
"query": "mon ap",
"operator": "and"
}
}
}
}'
returns everything except document "10".
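If you want to sanity-check those three results without a running cluster, the sketch below imitates the index and search analyzers in plain Python and applies the "and" operator to the ten phrases. It is an approximation under stated assumptions: asciifolding is skipped because the sample data is plain ASCII, scoring is ignored, and `index_terms` and `search` are names invented for this illustration.

```python
PHRASES = {
    "1": "APNEA MONITOR- SCHULTE Apnea Monitor",
    "2": "APNEA MONITOR- SCHULTE Apnea Monitor, Neonatal",
    "3": "APNEA MONITOR- SCHULTE Apnea Monitor, Recording",
    "4": "APNEA MONITOR- SCHULTE Cardiorespiratory Monitor",
    "5": "APNEA MONITOR- SCHULTE Impedance Pneumograph Bedside Monitor",
    "6": "APNEA MONITOR- SCHULTE Impedance Pneumograph Monitor",
    "7": "APNEA MONITOR- SCHULTE Neonatal Monitor",
    "8": "APNEA MONITOR- SCHULTE Pulmonary Function Monitor",
    "9": "APNEA MONITOR- SCHULTE Vital Signs Monitor",
    "10": "P.P.M. - PORTABLE PRECISION MONITOR Gas Monitor, Atmospheric",
}


def index_terms(phrase, min_gram=2, max_gram=20):
    """Index side: split on whitespace, lowercase, keep word prefixes."""
    terms = set()
    for word in phrase.lower().split():
        for n in range(min_gram, min(len(word), max_gram) + 1):
            terms.add(word[:n])
    return terms


def search(query):
    """Search side: split on whitespace, lowercase, require every term ("and")."""
    wanted = query.lower().split()
    return [
        doc_id
        for doc_id, phrase in PHRASES.items()
        if all(term in index_terms(phrase) for term in wanted)
    ]


print(search("monitor"))  # all ten ids
print(search("onitor"))   # []
print(search("mon ap"))   # ids "1" through "9"
print(search("mon rrr"))  # []
```

Document "10" drops out of the "mon ap" search because none of its words starts with "ap".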
Here is a runnable example you can play with (you will need ES installed and running at localhost:9200, or supply another endpoint): http://sense.qbox.io/gist/19fdcdb20c24436c64b7656c3b8002fe78667b12
Answer 1 (score: 0)
Changing the tokenizer (for both index and search) from keyword to standard seems to have done the trick.
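A likely reason, sketched in Python below: the keyword tokenizer emits the whole phrase as a single token, so the edge n-grams are prefixes of the entire phrase, not of each word. This is an offline imitation only. `wordish_tokenize` is a crude regex stand-in for the standard tokenizer (the real one follows Unicode word-boundary rules), `min_gram=1` is assumed because the question's settings do not set it, and all function names are invented for the illustration.

```python
import re


def edge_ngrams(token, min_gram=1, max_gram=20):
    """Prefixes of `token`; min_gram=1 is an assumed default."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


def keyword_tokenize(text):
    """Keyword tokenizer: the whole input becomes one token."""
    return [text]


def wordish_tokenize(text):
    """Crude stand-in for the standard tokenizer: runs of word characters."""
    return re.findall(r"\w+", text)


def matches(phrase, query, tokenize):
    """Index prefixes of each token, then require every query token to be present."""
    indexed = set()
    for token in tokenize(phrase.strip().lower()):
        indexed |= edge_ngrams(token)
    wanted = tokenize(query.strip().lower())
    return all(term in indexed for term in wanted)


phrase = "APNEA MONITOR- SCHULTE Vital Signs Monitor"
print(matches(phrase, "mon ap", keyword_tokenize))     # False
print(matches(phrase, "apnea mon", keyword_tokenize))  # True
print(matches(phrase, "mon ap", wordish_tokenize))     # True
print(matches(phrase, "onitor", wordish_tokenize))     # False
```

With the keyword tokenizer only a query that is itself a prefix of the whole phrase (such as "apnea mon") can match, which would explain why per-word prefix searches returned nothing until the tokenizer was changed.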