Question

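My use case requires querying our Elasticsearch domain with trailing wildcards. I would like to know the best practices for handling such wildcards in queries.

Do you think adding the following clauses to the query is good practice?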

"query" : { 
    "query_string" : { 
        "query" :   "attribute:postfix*",
        "analyze_wildcard" : true,
        "allow_leading_wildcard" : false,
        "use_dis_max" : false
    } 
}

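I have disallowed leading wildcards because they are an expensive operation. However, I wonder how sound it is in the long run to analyze wildcards on every query request. My understanding is that if the query does not actually contain any wildcard, analyzing wildcards has no effect. Is that correct?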

Answer 1

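If you are able to change your mapping type and index settings, the right approach is to create a custom analyzer with an edge-n-gram token filter, which will index every prefix of the attribute field.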

curl -XPUT http://localhost:9200/your_index -d '{
    "settings": {
        "analysis": {
            "filter": {
                "edge_filter": {
                    "type": "edgeNGram",
                    "min_gram": 1,
                    "max_gram": 15
                }
            },
            "analyzer": {
                "attr_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "edge_filter"]
                }
            }
        }
    },
    "mappings": {
        "your_type": {
            "properties": {
                "attribute": {
                    "type": "string",
                    "analyzer": "attr_analyzer",
                    "search_analyzer": "standard"
                }
            }
        }
    }
}'

Then, when you index a document, an attribute field value such as postfixing will be indexed as the following tokens: p, po, pos, post, postf, postfi, postfix, postfixi, postfixin, postfixing.
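One way to check the tokens yourself is the _analyze endpoint of the index. The request body below is only a sketch: it assumes the index and analyzer names used above (your_index, attr_analyzer) and an Elasticsearch version whose _analyze endpoint accepts a JSON body; older releases expect analyzer and text as URL parameters instead. It would be sent to http://localhost:9200/your_index/_analyze.

```json
{
    "analyzer": "attr_analyzer",
    "text": "postfixing"
}
```

The response should list one token per prefix, up to max_gram characters.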

Finally, you can query the attribute field for the value postfix with a simple match query. There is no need for a poorly performing wildcard in a query_string query.
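Such a match query could look like the request body below. This is a minimal sketch that assumes the field name and mapping shown above; it would be sent to the _search endpoint of your_index.

```json
{
    "query": {
        "match": {
            "attribute": "postfix"
        }
    }
}
```

Because the mapping sets search_analyzer to standard, the search term postfix is not broken into n-grams itself. It is compared as a whole against the prefixes stored at index time, so it matches a document whose attribute value is postfixing.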

