ElasticSearch / Elastica: searching for exact terms that contain "and" or other stop words

Time: 2013-04-15 03:36:27

Tags: elasticsearch stop-words elastica

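I'm trying to get an ES QueryString to match search terms that contain "and", but nothing I've tried so far (different analyzers, tokenizers, filters) has worked. In MySQL terms, what I want is: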

WHERE field LIKE '%abbot and costello%'

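I have tried various configurations. This is the one I'm currently using. It is a slight improvement, since it matches "abbot " (with a trailing space), but it still doesn't match anything containing "and":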

$eI->create(array(
    'analysis' => array(
        'analyzer' => array(
            // Index-time analyzer: front edge n-grams, lowercased
            'indexAnalyzer' => array(
                'type' => 'custom',
                'tokenizer' => 'SQLedgeNGram',
                'filter' => array(
                    'lowercase',
                ),
            ),
            // Search-time analyzer: currently identical to the index analyzer
            'searchAnalyzer' => array(
                'type' => 'custom',
                'tokenizer' => 'SQLedgeNGram',
                'filter' => array(
                    'lowercase',
                ),
            )
        ),
        'tokenizer' => array(
            'SQLedgeNGram' => array(
                'type' => 'edgeNGram',
                'min_gram' => 2,
                'max_gram' => 35,
                'side' => 'front'
            ),
            // Defined, but not referenced by either analyzer above
            'standardNoStop' => array(
                'type' => 'standard',
                'stopwords' => ''
            )
        )
    )
), true
);

This is my test-case field value:

Abbott and Costello - Funniest Routines, Vol. 

Trying various analyzers, I can't seem to match anything that contains "and".

Results:

searching [abbot] 
 @       searchAnalyzer          total results: 1
 @       standard                total results: 1
 @       simple                  total results: 1
 @       whitespace              total results: 1
 @       keyword                 total results: 1


searching [abbot ] 
 @       searchAnalyzer          total results: 1
 @       standard                total results: 1
 @       simple                  total results: 1
 @       whitespace              total results: 1
 @       keyword                 total results: 1


searching [abbot and c] 
     searchAnalyzer          total results: 0
     standard                total results: 0
     simple                  total results: 0
     whitespace              total results: 0
     keyword                 total results: 0


searching [abbot and cost] 
     searchAnalyzer          total results: 0
     standard                total results: 0
     simple                  total results: 0
     whitespace              total results: 0
     keyword                 total results: 0


searching [abbot and costello] 
     searchAnalyzer          total results: 0
     standard                total results: 0
     simple                  total results: 0
     whitespace              total results: 0
     keyword                 total results: 0


searching [abbot costello] 
     searchAnalyzer          total results: 0
     standard                total results: 0
     simple                  total results: 0
     whitespace              total results: 0
     keyword                 total results: 0

1 answer:

Answer 0: (score: 1)

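You have a typo in your query: the second "t" in "Abbott" is missing. You also don't need to run the search through n-grams. The search tokenizer can be keyword, and it will still work for phrases shorter than 35 characters. By the way, edgeNGram only gives you a trailing wildcard. For both leading and trailing wildcards, you need to use an nGram filter.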
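
As a rough sketch of that advice (not a tested configuration), the analysis settings might look something like the following, in the same array shape the question passes to $eI->create(). The names substringNGram, substringIndexAnalyzer and exactSearchAnalyzer are made up for illustration, and the gram sizes are simply carried over from the question. How the analyzers are attached to the field is not shown in the question, so it is left out here as well.

```php
<?php
// Sketch only: all three custom names below are hypothetical.
$settings = array(
    'analysis' => array(
        'filter' => array(
            // nGram filter: emits substrings from every position, which is
            // what gives both leading and trailing wildcard behaviour
            'substringNGram' => array(
                'type' => 'nGram',
                'min_gram' => 2,
                'max_gram' => 35,
            ),
        ),
        'analyzer' => array(
            // Index time: keep the whole value as one token, lowercase it,
            // then break it into n-grams
            'substringIndexAnalyzer' => array(
                'type' => 'custom',
                'tokenizer' => 'keyword',
                'filter' => array('lowercase', 'substringNGram'),
            ),
            // Search time: no n-grams, the whole lowercased phrase is one token
            'exactSearchAnalyzer' => array(
                'type' => 'custom',
                'tokenizer' => 'keyword',
                'filter' => array('lowercase'),
            ),
        ),
    ),
);

// Passed the same way as in the question: $eI->create($settings, true);
```

With settings along these lines, a search for "abbott and c" (spelled with both t's) is reduced to a single lowercase token that can be compared against the indexed grams. Search phrases longer than max_gram would have no matching gram, and a full nGram filter produces many more tokens per value than edgeNGram does, so the index grows accordingly.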