Solr Suggester component does not return matches for non-English words

Time: 2013-02-24 12:10:39

Tags: solr

I have defined a suggest component like this:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>

        <str name="field">autosuggest_general</str>
        <float name="threshold">0.005</float>
        <str name="buildOnCommit">true</str>
    </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
    <lst name="defaults">
        <str name="spellcheck">true</str>
        <str name="spellcheck.dictionary">suggest</str>
        <str name="spellcheck.onlymorepopular">true</str>
        <str name="spellcheck.count">5</str>
        <str name="spellcheck.collate">true</str>
    </lst>
    <arr name="components">
        <str>suggest</str>
    </arr>
</requestHandler>
The autosuggest_general field is defined like this:

<field name="autosuggest_general" type="autosuggest_type" indexed="true" stored="true" multiValued="true" />
<fieldType name="autosuggest_type" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

The suggester component does not return any hits for non-English words. For example, I want it to autocomplete the word Marcos.

When I call http://localhost:8983/solr/mycore/suggest?q=mar, I get the following response:

<response>
    <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">2</int>
    </lst>
    <lst name="spellcheck">
        <lst name="suggestions"/>
    </lst>
</response>

A regular search returns 10 hits:
http://localhost:8983/solr/mycore/select?q=autosuggest_general:marcos

For de, I get the following response:

<response>
    <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">1</int>
    </lst>
    <lst name="spellcheck">
        <lst name="suggestions">
            <lst name="de">
                <int name="numFound">3</int>
                <int name="startOffset">0</int>
                <int name="endOffset">2</int>
                <arr name="suggestion">
                    <str>design</str>
                    <str>developer</str>
                    <str>development</str>
                </arr>
            </lst>
            <str name="collation">design</str>
        </lst>
    </lst>
</response>

design, developer, and development are fine, but I do not get a suggestion for dejan, even though that word does exist in the autosuggest_general field.

http://localhost:8983/solr/mycore/select?q=autosuggest_general:dejan returns:

<response>
    <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">1</int>
        <lst name="params">
            <str name="q">autosuggest_general:dejan</str>
        </lst>
    </lst>
    <result name="response" numFound="7" start="0">
    ...
    </result>
</response>

I am using Solr 4.1.

Any help is greatly appreciated!

1 answer:

Answer 0 (score: 1)

This may be the problem:

<float name="threshold">0.005</float>

https://wiki.apache.org/solr/Suggester says:

threshold - threshold is a value in [0..1] representing the minimum fraction of documents (of the total) where a term should appear, in order to be added to the lookup dictionary.

Try lowering it and see whether you get matches.
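
To make the arithmetic concrete: with a threshold of 0.005, a term has to appear in roughly 0.5% of all documents to enter the lookup dictionary. The question reports 7 hits for dejan and 10 for marcos, so, assuming those hit counts approximate each term's document frequency, dejan would be dropped once the index holds more than about 1,400 documents (7 / 0.005) and marcos once it holds more than about 2,000 (10 / 0.005). The index size is not given in the question, so this is only a plausible explanation. A minimal sketch of the lowered setting, reusing the component from the question; the value 0.0 is an illustrative assumption, not something the answer specifies:

```xml
<searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
        <str name="field">autosuggest_general</str>
        <!-- Assumption: 0.0 is an illustrative value chosen so that no term is
             filtered out by document frequency; rare terms such as dejan or
             marcos should then be kept in the dictionary. -->
        <float name="threshold">0.0</float>
        <str name="buildOnCommit">true</str>
    </lst>
</searchComponent>
```

The dictionary has to be rebuilt before a new threshold takes effect. With buildOnCommit set to true a commit should trigger that; alternatively, a request with spellcheck.build=true, such as http://localhost:8983/solr/mycore/suggest?q=mar&spellcheck.build=true, rebuilds it explicitly. If 0.0 lets in too many rare or misspelled terms, a small non-zero value is a reasonable middle ground.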