Elasticsearch does not return text when a partial word is entered

Time: 2016-05-02 00:11:50

Tags: elasticsearch autocomplete

My analyzers are set up as follows:

"analyzer": {
    "edgeNgram_autocomplete": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": ["lowercase", "autocomplete"]
    },                
    "full_name": {
        "filter":["standard","lowercase","asciifolding"],
        "type":"custom",
        "tokenizer":"standard"
    }
}

My filter:

"filter": {
    "autocomplete": {
        "type": "edgeNGram",
        "side":"front",
        "min_gram": 1,
        "max_gram": 50
    }
}

The field mapping that uses these analyzers:

"textbox": {
    "_parent": {
        "type": "document"
    },            
    "properties": {
        "text": {
            "fields": {
                "text": {
                    "type":"string",
                    "analyzer":"full_name"
                },
                "autocomplete": {
                    "type": "string",
                    "index_analyzer": "edgeNgram_autocomplete",
                    "search_analyzer": "full_name",
                    "analyzer": "full_name"
                }
            },
            "type":"multi_field"
        }
    }
}

Putting it all together, this is my mapping for the docstore index:

PUT http://localhost:9200/docstore
{
    "settings": {
        "analysis": {
            "analyzer": {
                "edgeNgram_autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete"]
                },                
                "full_name": {
                   "filter":["standard","lowercase","asciifolding"],
                   "type":"custom",
                   "tokenizer":"standard"
                }
            },
            "filter": {
                "autocomplete": {
                    "type": "edgeNGram",
                    "side":"front",
                    "min_gram": 1,
                    "max_gram": 50
                }
            }
        }
    },
    "mappings": {
        "space": {
            "properties": {
                "name": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        },
        "document": {
            "_parent": {
                "type": "space"
            },
            "properties": {
                "name": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        },
        "textbox": {
            "_parent": {
                "type": "document"
            },            
            "properties": {
                "bbox": {
                    "type": "long"
                },
                "text": {
                    "fields": {
                        "text": {
                            "type":"string",
                            "analyzer":"full_name"
                        },
                        "autocomplete": {
                            "type": "string",
                            "index_analyzer": "edgeNgram_autocomplete",
                            "search_analyzer": "full_name",
                            "analyzer":"full_name"
                        }
                    },
                    "type":"multi_field"
                }
            }
        },
        "entity": {
            "_parent": {
                "type": "document"
            },
            "properties": {
                "bbox": {
                    "type": "long"
                },
                "name": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}

Add a space to hold all the documents:

POST http://localhost:9200/docstore/space
{
    "name": "Space 1"
}


When the user types the word: proj

this should return all of these texts:

  • SampleProject
  • Sample Project
  • Project Name
  • myProjectname
  • firstProjectName
  • my ProjectName

But it returns nothing.

My query:

POST http://localhost:9200/docstore/textbox/_search
{
    "query": {
        "match": {
            "text": "proj"
        }
    },
    "filter": {
        "has_parent": {
            "type": "document",
            "query": {
                "term": {
                    "name": "1-a1-1001.pdf"
                }
            }
        }
    }
}

If I search for project, I get:

{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 3.0133555,
        "hits": [
            {
                "_index": "docstore",
                "_type": "textbox",
                "_id": "AVRuV2d_f4y6IKuxK35g",
                "_score": 3.0133555,
                "_routing": "AVRuVvtLf4y6IKuxK33f",
                "_parent": "AVRuV2cMf4y6IKuxK33g",
                "_source": {
                    "bbox": [
                        8750,
                        5362,
                        9291,
                        5445
                    ],
                    "text": [
                        "Sample Project"
                    ]
                }
            },
            {
                "_index": "docstore",
                "_type": "textbox",
                "_id": "AVRuV2d_f4y6IKuxK35Y",
                "_score": 2.4106843,
                "_routing": "AVRuVvtLf4y6IKuxK33f",
                "_parent": "AVRuV2cMf4y6IKuxK33g",
                "_source": {
                    "bbox": [
                        8645,
                        5170,
                        9070,
                        5220
                    ],
                    "text": [
                        "Project Name and Address"
                    ]
                }
            }
        ]
    }
}

Maybe my edgeNGram filter is not right for this? I mean:

"side":"front"

Should I take a different approach?

Does anyone know what I am doing wrong?

2 answers:

Answer 0 (score: 1)

Your query should actually try to match text.autocomplete rather than text:

  "query": {
    "match": {
      "text.autocomplete": "proj"
    }
  }
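To see why the choice of field matters, here is a rough Python simulation of the two analyzers from the question. It is only a sketch: `standard_tokens` is a simplified stand-in for the standard tokenizer (it splits on non-alphanumeric characters and ignores asciifolding), and every function name is made up for illustration. It does mirror two real properties: the standard tokenizer does not split camelCase, and a front edge n-gram filter only emits prefixes of each token.

```python
import re

def standard_tokens(text):
    # Simplified stand-in for the standard tokenizer:
    # split on anything that is not a letter or digit.
    return [t for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def edge_ngrams(token, min_gram=1, max_gram=50):
    # Front edge n-grams are the prefixes of the token,
    # from min_gram up to max_gram characters long.
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]

def index_terms(text):
    # Index side: tokenize, lowercase, then expand into edge n-grams.
    terms = set()
    for token in standard_tokens(text):
        terms.update(edge_ngrams(token.lower()))
    return terms

def search_terms(text):
    # Search side: tokenize and lowercase only, no n-grams.
    return {t.lower() for t in standard_tokens(text)}

def matches(query, text):
    # A match needs at least one search term among the indexed terms.
    return bool(search_terms(query) & index_terms(text))

if __name__ == "__main__":
    samples = ["SampleProject", "Sample Project", "Project Name",
               "myProjectname", "firstProjectName", "my ProjectName"]
    for sample in samples:
        print(sample, matches("proj", sample))
```

Under these assumptions, proj matches only the texts where a token starts with "proj" ("Sample Project", "Project Name", "my ProjectName"). Single tokens such as "SampleProject" produce prefixes of "sampleproject" only, so matching inside a word would need a different approach, for example an nGram filter instead of edgeNGram.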

Answer 1 (score: 1)

The problem is the name of the index analyzer setting on the autocomplete field.

Change:

"index_analyzer": "edgeNgram_autocomplete"

To:

"analyzer": "edgeNgram_autocomplete"

Then search the way @Andrei Stefan shows in his answer:

POST http://localhost:9200/docstore/textbox/_search
{
    "query": {
        "match": {
            "text.autocomplete": "proj"
        }
    }
}

It will work as expected!

I tested your configuration on Elasticsearch 2.3.
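If the has_parent restriction from the question is still needed, one possible way to combine it with the autocomplete match is a bool query with a filter clause, which Elasticsearch 2.x supports. The sketch below only builds the request body in Python; `build_autocomplete_query` is a hypothetical helper, and the field, type, and file names are taken from the question.

```python
import json

def build_autocomplete_query(prefix, parent_name):
    # Match the typed prefix against the edge n-gram sub-field and
    # restrict hits to textboxes whose parent document has the given name.
    return {
        "query": {
            "bool": {
                "must": {
                    "match": {"text.autocomplete": prefix}
                },
                "filter": {
                    "has_parent": {
                        "type": "document",  # same key the question uses
                        "query": {"term": {"name": parent_name}},
                    }
                },
            }
        }
    }

if __name__ == "__main__":
    body = build_autocomplete_query("proj", "1-a1-1001.pdf")
    # Body for POST http://localhost:9200/docstore/textbox/_search
    print(json.dumps(body, indent=4))
```

Keeping the parent restriction in the filter clause means it narrows the hits without contributing to the relevance score.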

By the way, the multi_field type is deprecated.
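As a sketch of what the text field could look like without multi_field, the main field can be declared as a plain string with the sub-field listed under fields. This assumes Elasticsearch 2.x string mappings; `text_field_mapping` is a hypothetical helper, and the analyzer names are the ones defined in the question.

```python
import json

def text_field_mapping():
    # The main "text" field takes over the role of the old "text.text"
    # sub-field; "text.autocomplete" indexes edge n-grams but analyzes
    # search input with full_name so the query is not n-grammed too.
    return {
        "text": {
            "type": "string",
            "analyzer": "full_name",
            "fields": {
                "autocomplete": {
                    "type": "string",
                    "analyzer": "edgeNgram_autocomplete",
                    "search_analyzer": "full_name",
                }
            },
        }
    }

if __name__ == "__main__":
    # Goes under "properties" of the textbox type in the index mapping.
    print(json.dumps(text_field_mapping(), indent=4))
```

With this layout the queries shown above stay the same: text searches whole words and text.autocomplete searches prefixes.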

Hope this helps :)