Solr Suggester - how to make Solr suggestions case-insensitive

Time: 2013-09-08 16:27:59

Tags: solr autocomplete

My suggester (spellchecker) returns case-sensitive results. (I use it for autocomplete: "Dog" and "dog" return different phrases.)

My suggester is defined as follows in solrconfig:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
        <str name="field">suggest</str>  <!-- the indexed field to derive suggestions from -->
        <float name="threshold">0.005</float>
        <str name="buildOnCommit">true</str>
        <!--<str name="sourceLocation">american-english</str>-->
    </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
    <lst name="defaults">
        <str name="spellcheck">true</str>
        <str name="spellcheck.dictionary">suggest</str>
        <str name="spellcheck.onlyMorePopular">true</str>
        <str name="spellcheck.count">5</str>
        <str name="spellcheck.collate">true</str>
    </lst>
    <arr name="components">
        <str>suggest</str>
    </arr>
</requestHandler>
And in the schema:

<field name="suggest" type="phrase_suggest" indexed="true" stored="true" required="false" multiValued="true"/>  

<copyField source="Name" dest="suggest"/>

<fieldtype name="phrase_suggest" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([^\p{L}\p{M}\p{N}\p{Cs}]*[\p{L}\p{M}\p{N}\p{Cs}\_]+:)|([^\p{L}\p{M}\p{N}\p{Cs}])+"
            replacement=" " replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldtype>

3 answers:

Answer 0: (score: 0)

Try changing the order of the filter factories in the fieldType. Specifically, put LowerCaseFilterFactory at the top of the list.

Shishir
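
A sketch of what the reordering suggested above might look like, reusing the question's phrase_suggest type unchanged apart from the filter order. The tokenizer still has to come before the filters, so "top of the list" is read here as "first filter"; that reading is an assumption. Because the pattern matches on Unicode categories rather than on letter case, this reordering should not change what gets indexed, so whether it resolves the problem is not confirmed by the thread.

```xml
<fieldtype name="phrase_suggest" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase first, so every later filter sees lowercase input -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([^\p{L}\p{M}\p{N}\p{Cs}]*[\p{L}\p{M}\p{N}\p{Cs}\_]+:)|([^\p{L}\p{M}\p{N}\p{Cs}])+"
            replacement=" " replace="all"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldtype>
```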

Answer 1: (score: 0)

To do this, add the field type to the search component declaration in solrconfig.xml. In this case that is "phrase_suggest", but use whichever field type in your schema.xml declares LowerCaseFilterFactory.

<searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
        <str name="field">suggest</str>  <!-- the indexed field to derive suggestions from -->
        <float name="threshold">0.005</float>
        <str name="buildOnCommit">true</str>

        <!-- THIS IS THE LINE TO ADD -->
        <str name="suggestAnalyzerFieldType">phrase_suggest</str>

    </lst>
</searchComponent>

Answer 2: (score: 0)

Actually, the correct config parameter is "queryAnalyzerFieldType", and it must go outside the lst element, like this:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
        <str name="field">suggest</str>  <!-- the indexed field to derive suggestions from -->
        <float name="threshold">0.005</float>
        <str name="buildOnCommit">true</str>

    </lst>
    <!-- Make it case-insensitive -->
    <str name="queryAnalyzerFieldType">text_general</str>
</searchComponent>

This works for both spelling correction and suggestions.
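
The snippet above uses text_general, which has to exist in your schema. As a variation, here is a minimal sketch that points queryAnalyzerFieldType at the question's own phrase_suggest type instead. This assumes phrase_suggest is a suitable query analyzer because it already includes LowerCaseFilterFactory, so the typed prefix would be lowercased the same way as the indexed values; the thread does not confirm this combination.

```xml
<searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
        <str name="field">suggest</str>
        <float name="threshold">0.005</float>
        <str name="buildOnCommit">true</str>
    </lst>
    <!-- Assumption: reuse the lowercasing field type from the question's schema -->
    <str name="queryAnalyzerFieldType">phrase_suggest</str>
</searchComponent>
```

After changing the configuration, the suggester dictionary probably needs to be rebuilt, either by a commit (buildOnCommit is true here) or by sending a request with spellcheck.build=true.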