Question

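I want to be able to autocomplete names.

For example, if we have the name John Smith, I want to be able to search for Jo, Sm, and John Sm and get that document back.

In addition, I do not want jo sm to match the document.

I currently have this analyzer: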

return array(
    'settings' => array(
        'index' => array(
            'analysis' => array(
                'analyzer' => array(
                    'autocomplete' => array(
                        'tokenizer' => 'autocompleteEngram',
                        'filter' => array('lowercase', 'whitespace')
                    )
                ),

                'tokenizer' => array(
                    'autocompleteEngram' => array(
                        'type' => 'edgeNGram',
                        'min_gram' => 1,
                        'max_gram' => 50
                    )
                )
            )   
        )
    )
);

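The problem is that the text is first split into words, and each word is then tokenized with edge n-grams.

This results in: j jo joh john s sm smi smit smith

This means that if I search for john smith or john sm, nothing is returned.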
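As a rough illustration of why that happens, the sketch below reproduces the token list above in plain PHP. The helper edgeNgramsPerWord is hypothetical and only mimics the behaviour described in this question (split on whitespace, lowercase, then take prefixes of each word); it is not Elasticsearch's own implementation and it assumes ASCII input.

```php
<?php
// Hypothetical helper, for illustration only: emits the edge n-grams of each
// whitespace-separated word, mimicking the token list described in the question.
// Assumes ASCII input, since strlen/substr work on bytes.
function edgeNgramsPerWord(string $text, int $minGram = 1, int $maxGram = 50): array
{
    $tokens = array();
    // Lowercase the input and split it into words first.
    $words = preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        // Emit every prefix of the word from $minGram up to $maxGram characters.
        $limit = min(strlen($word), $maxGram);
        for ($length = $minGram; $length <= $limit; $length++) {
            $tokens[] = substr($word, 0, $length);
        }
    }
    return $tokens;
}

$tokens = edgeNgramsPerWord('John Smith');
echo implode(' ', $tokens), PHP_EOL;
// prints: j jo joh john s sm smi smit smith

// No single token spans both words, so a term such as "john sm" is never indexed.
var_dump(in_array('john sm', $tokens, true)); // bool(false)
```

Because every token is a prefix of a single word, a search term containing a space has nothing to match against.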

So I need to generate tokens that look like this: j jo joh john s sm smi smit smith john s john sm john smi john smit john smith.

How can I set up my analyzer so that it generates the extra tokens?

Answer 1

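I ended up not using edge n-grams.

I created an analyzer with the standard tokenizer and the standard and lowercase filters. This is effectively the same as the standard analyzer, but without any stop word filter (we are searching names after all, and someone might be called The or An, etc.).

I then set the above analyzer as the index_analyzer and simple as the search_analyzer. Using this setup with a match_phrase_prefix query works very well.

This is the custom analyzer I used (called autocomplete and expressed in PHP):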

'autocomplete' => array(
    'tokenizer' => 'standard',
    'filter' => array('standard', 'lowercase')
),
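
To make the rest of the setup concrete, here is a minimal sketch of how the mapping and the query body might look as PHP arrays. The type name person and the field name name are assumptions made for illustration and do not come from the setup above. The index_analyzer parameter applies to older Elasticsearch versions; from 2.0 onward it was replaced by analyzer.

```php
<?php
// Sketch only: 'person' and 'name' are hypothetical names chosen for illustration.
$mapping = array(
    'person' => array(
        'properties' => array(
            'name' => array(
                'type' => 'string',
                // Custom analyzer from above, applied at index time.
                'index_analyzer' => 'autocomplete',
                // Built-in simple analyzer, applied to the search text.
                'search_analyzer' => 'simple',
            ),
        ),
    ),
);

// Request body for a match_phrase_prefix query: all terms must appear in order
// as a phrase, and the last term ("sm") is treated as a prefix.
$searchBody = array(
    'query' => array(
        'match_phrase_prefix' => array(
            'name' => 'john sm',
        ),
    ),
);
```

With a body like this, john sm should match John Smith, while jo sm should not, because jo is not a complete indexed term and only the last term is expanded as a prefix.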
