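I have an autocomplete index that translates Norwegian letters into their international equivalents; for example, æ comes back as ae in the result set. How can I make it return the Norwegian letters instead?
You can test it by typing "eksot" at https://norecopa.no/search. The second result will be "eksotiske kjaeledyr", which should be "eksotiske kjæledyr".
This is the definition of the index: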
<fieldtype name="suggest_phrase" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="(^[^A-Za-z0-9]*|[^A-Za-z0-9]*$)" replacement="" replace="all" />
    <filter class="solr.LengthFilterFactory" min="1" max="60" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="7" outputUnigrams="true" outputUnigramsIfNoShingles="true" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="99" outputUnigrams="false" outputUnigramsIfNoShingles="true" />
  </analyzer>
</fieldtype>
Here are the component and the request handler:
<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookupFactory</str>
    <str name="field">text_sug</str> <!-- the indexed field to derive suggestions from -->
    <float name="threshold">0.005</float>
    <str name="buildOnCommit">true</str>
    <str name="buildOnOptimize">true</str>
  </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">text_suggester</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
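Answer 0 (score: 0)
You're using TSTLookupFactory, which is the simplest implementation. There are several more.
One of them, AnalyzingLookupFactory, runs an additional analysis step using a separate field type. I believe that if you move the character mapping into that step, you will match on the ASCII representation but get the original value returned.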
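A minimal sketch of that idea, assuming a Solr version that ships solr.SuggestComponent; the older SpellCheckComponent-based Suggester from the question may accept similar parameters, but I have not verified that. The names suggest_ascii, text_sug_raw and analyzing are made up for illustration. text_sug_raw is assumed to be a stored field holding the unmodified phrase, and ASCIIFoldingFilterFactory stands in for whatever mapping currently turns æ into ae.

```xml
<!-- schema.xml: hypothetical field type, used only to analyze text for matching -->
<fieldType name="suggest_ascii" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- keep the whole phrase as one token so suggestions are full phrases -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- folds æ to ae for matching only; the stored text is left untouched -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>

<!-- solrconfig.xml: suggester that matches on the analyzed form -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">analyzing</str> <!-- hypothetical dictionary name -->
    <str name="lookupImpl">AnalyzingLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">text_sug_raw</str> <!-- assumed stored field with the original text -->
    <str name="suggestAnalyzerFieldType">suggest_ascii</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.dictionary">analyzing</str>
    <str name="suggest.count">5</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
```

With something like this in place, a request such as /suggest?suggest.q=eksot should match against the folded form while the suggestion text is taken from the stored value, so "kjæledyr" would keep its æ. The suggester has to be rebuilt after the change before results reflect it.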