用于索引和查询分析器的Solr SnowballPorterFilterFactory

时间:2012-05-15 20:40:55

标签: search solr snowball snowballanalyzer

我对索引和查询分析器使用 SnowballPorterFilterFactory 。 当我搜索“专业”字样时。 Solr成功地只找到包含“职业”的文章,但我想要“专业”“专业”......

这是 schema.xml

上的当前配置
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
        <filter class="solr.ASCIIFoldingFilterFactory" />
        <filter class="solr.SnowballPorterFilterFactory" language="French"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>

    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
        <filter class="solr.ASCIIFoldingFilterFactory" />
        <filter class="solr.SnowballPorterFilterFactory" language="French"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    </analyzer>
</fieldType>

1 个答案:

答案 0 :(得分:0)

正在发生的事情是搬运工过度阻止您的查询。当您搜索profession时,您的关键字会被归结为profess,而profession professionalprofessionalism都会被存储在索引中profession

唯一真正能解决这个问题的方法是添加另一个fieldType来阻止查询。

类似的东西:

<fieldType name="text_unstemmed_query" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
        <filter class="solr.ASCIIFoldingFilterFactory" />
        <filter class="solr.SnowballPorterFilterFactory" language="French"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
        <filter class="solr.ASCIIFoldingFilterFactory" />
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    </analyzer>
</fieldType>

使用如下的复制字段:

<copyField source="your_text_field" dest="text_unstem_query_field"/>