我对索引和查询分析器使用 SnowballPorterFilterFactory 。 当我搜索“专业”字样时。 Solr成功地只找到包含“职业”的文章,但我想要“专业”“专业”......
这是 schema.xml
上的当前配置<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.SnowballPorterFilterFactory" language="French"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.SnowballPorterFilterFactory" language="French"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
</analyzer>
</fieldType>
答案 0 :(得分:0)
正在发生的事情是搬运工过度阻止您的查询。当您搜索profession
时,您的关键字会被归结为profess
,而profession
professional
和professionalism
都会被存储在索引中profession
唯一真正能解决这个问题的方法是添加另一个fieldType
来阻止查询。
类似的东西:
<fieldType name="text_unstemmed_query" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.SnowballPorterFilterFactory" language="French"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
</analyzer>
</fieldType>
使用如下的复制字段:
<copyField source="your_text_field" dest="text_unstem_query_field"/>