Solr synonym expansion is incorrect

Asked: 2014-03-11 22:49:30

Tags: solr

I'm trying to track down a strange Solr SynonymFilterFactory issue. It occurs in Solr 4.6.1 and 4.7 with this configuration:

    <fieldType name="text_buggy" class="solr.TextField">
        <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory" 
                synonyms="synonyms.txt" ignoreCase="true" expand="true" />
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
    </fieldType>

and this synonyms.txt entry:

    wbc,white blood count

The field analysis output in the Solr admin for the string "due to elevated wbc patient was placed on medication" shows the following:

ST | due | to | elevated | wbc         | patient         | was         | placed | on | medication
SF | due | to | elevated | wbc | white | patient | blood | was | count | placed | on | medication

But why doesn't it look like the following instead? Because of the output above, I'm getting some strange search results:

ST | due | to | elevated | wbc                         | patient | was | placed | on | medication
SF | due | to | elevated | wbc | white | blood | count | patient | was | placed | on | medication

Update: After reading this SOLR bug, I did find that switching to LUCENE_33 gives me better results (probably what I remember from the past):

ST | due | to | elevated | wbc                                         | patient | was | placed | on | medication
SF | due | to | elevated | wbc | white | white | count | blood | count | patient | was | placed | on | medication
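
For reference, here is a minimal sketch of the kind of change meant by the update. It assumes the version is controlled by the `luceneMatchVersion` element in solrconfig.xml; the element and its placement are an assumption about the setup, not copied from the actual config, and everything else in the file is omitted.

```xml
<!-- solrconfig.xml (sketch only): assumes the Lucene compatibility version
     is set through the luceneMatchVersion element; all other settings
     are omitted here. -->
<config>
    <!-- LUCENE_33 is the value mentioned in the update -->
    <luceneMatchVersion>LUCENE_33</luceneMatchVersion>
</config>
```

Since the synonym filter sits in the index-time analyzer, documents would need to be reindexed after a change like this before search results reflect it.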

0 answers:

No answers