Question

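I'm new to Elasticsearch. I want to search by substring, where the substring consists of digits and symbols such as "/" and "-". For example, I create an index with default settings and a single indexed field: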

curl -XPUT "http://localhost:9200/test/" -d ' {
    "mappings" : {
            "properties": {
                    "test_field": {
                            "type": "string"
                    }
            }
    }
} '

Then I add some data to the index:

curl -XPOST "http://localhost:9200/test/test_field" -d '{ "test_field" : "14/21-35" }'
curl -XPOST "http://localhost:9200/test/test_field" -d '{ "test_field" : "1/1-35" }'
curl -XPOST "http://localhost:9200/test/test_field" -d '{ "test_field" : "1/2-25" }'

After refreshing the index, I run a search. I want to find the documents where "test_field" starts with "1/1". My request:

curl -X GET "http://localhost:9200/test/_search?pretty=true" -d '{"query":{"query_string":{"query":"1/1*"}}}'

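returns no hits. If I remove the asterisk, I get two hits in the response: "1/1-35" and "1/2-25". If I try to escape the slash with a backslash ("1\/1*"), the results are the same in both cases.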

When my query contains the "-" symbol, I have to escape it because it is a Lucene special character. So I send the following search request:

curl -X GET "http://localhost:9200/test/_search?pretty=true" -d '{"query":{"query_string":{"query":"*1\-3*"}}}'

and it returns a parse error. If I double-escape ("\\") the minus sign, I get no results.

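I don't understand how the search is executed when the query contains these characters. Maybe I'm doing something wrong.

I tried using an nGram filter in a custom analyzer, but it doesn't fit the requirements of the search engine.

If anyone has run into this problem, please answer.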

Answer 1

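The default analyzer strips all special characters from your data at index time. You can either use the keyword analyzer or not analyze the data at index time at all: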

curl -XPUT "http://localhost:9200/test/" -d ' {
    "mappings" : {
            "properties": {
                    "test_field": {
                            "type": "string",
                            "index": "not_analyzed"
                    }
            }
    }
} '
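
Once the field is stored as a single unanalyzed term, term-level queries such as `prefix` and `wildcard` compare against the raw value, so "/" and "-" need no Lucene escaping. The following is only a sketch under that assumption: the helper names `prefix_query` and `wildcard_query` are hypothetical, they do no JSON escaping of their argument, and the commented curl calls assume the same local node and index as above.

```shell
# Hypothetical helpers: print a term-level query body for test_field.
# They do not JSON-escape the argument, so avoid quotes/backslashes in it.
prefix_query() {
  printf '{"query":{"prefix":{"test_field":"%s"}}}' "$1"
}

wildcard_query() {
  printf '{"query":{"wildcard":{"test_field":"%s"}}}' "$1"
}

# Usage (requires a running node with the not_analyzed mapping above):
# curl -XGET "http://localhost:9200/test/_search?pretty=true" -d "$(prefix_query '1/1')"
# curl -XGET "http://localhost:9200/test/_search?pretty=true" -d "$(wildcard_query '*1-3*')"
```

With the three sample documents, the prefix query should match only "1/1-35". Note that a wildcard pattern with a leading "*" has to scan every term in the field, so it can be slow on large indices.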
