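I have configured the Solr field type below and want to know how to remove certain words from search, for example fries, fried, etc. I tried putting them in stopwords.txt, but that did not work: Solr still returns results. My other question is this.
How can I restrict the search so that a result is returned only if the text contains both words, whether they are adjacent or apart? For example,
if I search for shrimp poboy, it should return 1 and 3 but not 2.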
poboy sandwich type shrimp sandwich
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!--tokenizer class="solr.KeywordTokenizerFactory"/-->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="(;|,|-)\s*" replacement=" " replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="0"
            preserveOriginal="1"
    />
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.keyword.txt" ignoreCase="true" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!--tokenizer class="solr.KeywordTokenizerFactory"/-->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="(;|,|-)\s*" replacement=" " replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="0"
            preserveOriginal="1"
    />
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.keyword.txt" ignoreCase="true" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
Answer 0 (score: 1)
For the words you want to exclude from search, you need to add another filter to the analyzer:
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
If that still does not work, go to the Solr admin panel, open Analysis, and try a query that contains the stopword to see how it is processed.
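As a rough illustration of the stopwords approach from the question, the file could look like the sketch below. It assumes the words to drop are `fries` and `fried` (taken from the question's example) and that `stopwords.txt` lives in the core's `conf` directory alongside the schema. Because `StopFilterFactory` comes before `PorterStemFilterFactory` in the posted chain, the entries should be the unstemmed forms.

```text
# stopwords.txt: one term per line; lines starting with # are comments
fries
fried
```

After editing the file, the core has to be reloaded. Since the same filter is also in the index analyzer, documents indexed earlier keep their old tokens until they are reindexed.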
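For the second question, Solr provides proximity search: put the words in a quoted phrase and append ~2 after the closing quote
to specify that there may be at most two words between them.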
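To make the proximity syntax concrete, here is a sketch of the query for the question's example. The field name `name` is a placeholder, since the post shows only the field type and not the field it is applied to.

```text
name:"shrimp poboy"~2
```

The slop value goes directly after the closing quote. In Lucene's phrase matching, swapping two adjacent words counts as two moves, so a slop of 2 should also accept the two words in reverse order when they sit next to each other.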