Can anyone explain how stopwords work in Solr?
In my stopwords.txt I have defined the word "of". In schema.xml I have:
<filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" enablePositionIncrements="true"/>
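Now, whenever I search for anything containing the word "of", no results are shown.
Example: "oil of olay" returns no results, while "oil olay" returns the correct results.
Here is more of the definition from schema.xml: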
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory"
          ignoreCase="true"
          words="stopwords.txt"
          enablePositionIncrements="true"
  />
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1"
          generateNumberParts="1"
          catenateWords="1"
          catenateNumbers="1"
          catenateAll="1"
          preserveOriginal="1"
          splitOnCaseChange="0"
          splitOnNumerics="0"
          types="wdtypes.txt"
  />
  <filter class="solr.KeywordRepeatFilterFactory"/>
  <filter class="solr.EnglishMinimalStemFilterFactory"/>
  <filter class="solr.TrimFilterFactory" updateOffsets="false"/>
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory"
          ignoreCase="true"
          words="stopwords.txt"
          enablePositionIncrements="true"
  />
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1"
          generateNumberParts="1"
          catenateWords="1"
          catenateNumbers="1"
          catenateAll="1"
          preserveOriginal="1"
          splitOnCaseChange="0"
          splitOnNumerics="0"
          types="wdtypes.txt"
  />
  <filter class="solr.EnglishMinimalStemFilterFactory"/>
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
Debug output:
+(upclist:cream+of+wheat&amp;qt=productresults&amp;rows=10&amp;fq=status%3AActive&amp;fq=facilitystatus%3AActive&amp;fq=facilityid%3A100&amp;fq=inventoryctrlcode%3A%5B0+TO+100%5D&amp;fq=weblifecycle%3A%283+OR+4%29&amp;fq=groupnumber%3A2^1.2
| keyword:cream+of+wheat&amp;qt=productresults&amp;rows=10&amp;fq=status%3aactive&amp;fq=facilitystatus%3aactive&amp;fq=facilityid%3a100&amp;fq=inventoryctrlcode%3a%5b0+to+100%5d&amp;fq=weblifecycle%3a%283+or+4%29&amp;fq=groupnumber%3a2^20.0
| product_elevate:cream+of+wheat&amp;qt=productresults&amp;rows=10&amp;fq=status%3aactive&amp;fq=facilitystatus%3aactive&amp;fq=facilityid%3a100&amp;fq=inventoryctrlcode%3a%5b0+to+100%5d&amp;fq=weblifecycle%3a%283+or+4%29&amp;fq=groupnumber%3a2^5.0
| area:"(cream+of+wheat&amp;qt=productresults&amp;rows=10&amp;fq=status%3aactive&amp;fq=facilitystatus%3aactive&amp;fq=facilityid%3a100&amp;fq=inventoryctrlcode%3a%5b0+to+100%5d&amp;fq=weblifecycle%3a%283+or+4%29&amp;fq=groupnumber%3a2 cream) of wheat qt productresult (row creamofwheatqtproductresultsrow) 10 fq status%3aactive fq facilitystatus%3aactive fq facilityid%3a100 fq inventoryctrlcode%3a%5b0 (to fqstatus%3aactivefqfacilitystatus%3aactivefqfacilityid%3a100fqinventoryctrlcode%3a%5b0to) 100%5d fq weblifecycle%3a%283 (or fqweblifecycle%3a%283or) 4%29 fq (groupnumber%3a2 fqgroupnumber%3a2 creamofwheatqtproductresultsrows10fqstatus%3aactivefqfacilitystatus%3aactivefqfacilityid%3a100fqinventoryctrlcode%3a%5b0to100%5dfqweblifecycle%3a%283or4%29fqgroupnumber%3a2)"~3^2.5
| productnumber:cream+of+wheat&amp;qt=productresults&amp;rows=10&amp;fq=status%3AActive&amp;fq=facilitystatus%3AActive&amp;fq=facilityid%3A100&amp;fq=inventoryctrlcode%3A%5B0+TO+100%5D&amp;fq=weblifecycle%3A%283+OR+4%29&amp;fq=groupnumber%3A2^1.7
| productname:cream+of+wheat&amp;qt=productresults&amp;rows=10&amp;fq=status%3aactive&amp;fq=facilitystatus%3aactive&amp;fq=facilityid%3a100&amp;fq=inventoryctrlcode%3a%5b0+to+100%5d&amp;fq=weblifecycle%3a%283+or+4%29&amp;fq=groupnumber%3a2^10.0)~0.01 ()
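Answer 0 (score: 0)
This may not be relevant, since you said you are only searching one field (I am posting anyway because you said you are using edismax and qf). I ran into a similar problem when I wanted to boost exact matches, so I had set qf like this: <str name="qf">title^45 title_str^55</str>. The title field used stopwords, while title_str obviously did not. The reason why queries containing stopwords often find nothing is described here. Their solution was to use the mm value. The solution that worked in my case was to put title_str in the pf tag (and remove it from the qf tag), so that exact matches appear at the top.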
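As a rough illustration of the change described in this answer, the handler defaults might look like the sketch below. The handler name "/select" and the reuse of the boost 55 for pf are assumptions made for the example; only the field names and the qf boost come from the answer.

```xml
<!-- Sketch for solrconfig.xml. The handler name "/select" is an assumption. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- qf: only the stopword-filtered field produces the matching clauses,
         so a term like "of" is dropped consistently for every qf field -->
    <str name="qf">title^45</str>
    <!-- pf: the unfiltered field is used for phrase boosting only; it raises
         the score of documents that already matched, it never excludes any.
         The boost value 55 is carried over from the old qf as an assumption. -->
    <str name="pf">title_str^55</str>
  </lst>
</requestHandler>
```

The idea behind this layout is that a stopword then no longer creates a clause that exists in one qf field but not in another, which is the situation that tends to make mm reject otherwise good documents.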
Answer 1 (score: 0)
I finally solved this by changing the "mm" parameter from 2<-25% to 2<-36%.
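A possible explanation of why this change helps, assuming a three-term query such as oil of olay: with a conditional mm, queries of up to two clauses require all clauses, and longer queries allow the given percentage of clauses (rounded down) to be missing. The fragment below is a sketch of where the value might be set. The surrounding defaults block and the defType entry are assumptions; the arithmetic in the comment follows the documented rounding rule.

```xml
<!-- Sketch of edismax defaults in solrconfig.xml (location is an assumption). -->
<lst name="defaults">
  <str name="defType">edismax</str>
  <!--
    The value is written with &lt; because a raw "<" is not allowed in XML text.
    For a query with 3 clauses:
      old value (25%): 3 * 0.25 = 0.75, rounded down to 0 missing, so 3 required
      new value (36%): 3 * 0.36 = 1.08, rounded down to 1 missing, so 2 required
  -->
  <str name="mm">2&lt;-36%</str>
</lst>
```

Under the old value, the clause for "of" had to match, which is only possible in fields that do not remove stopwords. Under the new value, one of the three clauses may be missing, so documents matching oil and olay can be returned.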