My Solr schema is as follows (important parts only):
<fieldType name="bagofwords_expertfinding" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<!-- strip HTML markup before tokenizing -->
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords_en.txt"
enablePositionIncrements="true"
/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="^[0-9-/_,\.]+$" replacement="" replace="all"/>
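<!-- blank out tokens containing a letter repeated more than two times in a row; the LengthFilter below then drops them -->
<filter class="solr.PatternReplaceFilterFactory" pattern="^.*([a-zA-Z])\1\1+.*$" replacement=""/>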
<filter class="solr.PorterStemFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="3" max="100"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords_en.txt"
enablePositionIncrements="true"
/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="^[0-9-/_,\.]+$" replacement="" replace="all"/>
<filter class="solr.PorterStemFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="3" max="100"/>
</analyzer>
</fieldType>
<fieldType name="namedentities_expertfinding" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<!-- strip whitespace around commas so that each comma-separated entity becomes a single token -->
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\s," replacement=","/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern=",\s" replacement=","/>
<tokenizer class="solr.PatternTokenizerFactory" pattern="," />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords_en.txt"
enablePositionIncrements="true"
/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="^[0-9-/_,\.]+$" replacement="" replace="all"/>
<filter class="solr.LengthFilterFactory" min="3" max="100"/>
</analyzer>
</fieldType>
In the namedentities field I index multi-word terms, for example "diego alberto milito" and "diego armando maradona". I am trying to search both fields, using a dismax query to boost them differently.
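As a rough illustration of how such terms are tokenized, the Python sketch below emulates the two `namedentities_expertfinding` analyzers from the schema above. It is a simplification, not Solr itself: the function names are hypothetical, the stop word, possessive and numeric-pattern filters of the query analyzer are left out, and empty tokens are assumed to be discarded.

```python
import re


def index_tokens(text: str) -> list[str]:
    """Approximate the index analyzer: trim one whitespace character around
    each comma, split on commas, then lowercase."""
    text = re.sub(r"\s,", ",", text)  # first PatternReplaceCharFilterFactory
    text = re.sub(r",\s", ",", text)  # second PatternReplaceCharFilterFactory
    return [tok.lower() for tok in text.split(",") if tok]


def query_tokens(text: str) -> list[str]:
    """Approximate the query analyzer: split on whitespace, lowercase,
    keep tokens of 3 to 100 characters (other filters omitted)."""
    return [tok.lower() for tok in text.split() if 3 <= len(tok) <= 100]


if __name__ == "__main__":
    print(index_tokens("Diego Alberto Milito , Diego Armando Maradona"))
    print(query_tokens("diego armando maradona"))
```

Under these assumptions the index side keeps each entity as one token ("diego armando maradona"), while the query side produces three separate word tokens, so the two sides may not line up for this field.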
But when I try this query: localhost:8080/solr/select/?q="diego armando maradona"&defType=dismax&qf=namedentities^100 bagofwords^1&fl=*,score&debugQuery=true&mm=0
Solr returns nothing. Maybe I don't understand the correct use of the " symbol.
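I also don't understand this passage from the Solr wiki:
"In Solr 1.4 and earlier, you should basically set mm=0 if you want the equivalent of q.op=OR, and mm=100% if you want the equivalent of q.op=AND. In 3.x and trunk the default value of mm is dictated by the q.op param (q.op=AND => mm=100%; q.op=OR => mm=0%). Keep in mind the default operator is affected by your schema.xml entry. In older versions of Solr the default value is 100% (all clauses must match)"
In my schema the defaultOperator is OR, so why do I get a default mm value of 100% when I don't set mm=0?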
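To make the wiki passage concrete, here is a minimal Python sketch of how an mm value translates into a required number of matching optional clauses, and of the 3.x default described in the quote. The function names are hypothetical, and only plain integers and percentages are handled (conditional specs such as `3<90%` are not); it is an approximation of the documented behaviour, not Solr's actual code.

```python
def min_should_match(optional_clauses: int, spec: str) -> int:
    """Return how many optional clauses must match for a simple mm spec."""
    spec = spec.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Percentages are computed from the clause count and rounded down.
        calc = optional_clauses * abs(percent) // 100
        # A negative percentage says how many clauses may be missing.
        result = optional_clauses - calc if percent < 0 else calc
    else:
        calc = int(spec)
        # A negative integer says how many clauses may be missing.
        result = optional_clauses + calc if calc < 0 else calc
    return max(0, min(result, optional_clauses))


def default_mm(q_op: str) -> str:
    """Default mm in 3.x according to the quoted wiki text."""
    return "100%" if q_op.upper() == "AND" else "0%"


if __name__ == "__main__":
    for op in ("AND", "OR"):
        mm = default_mm(op)
        print(op, mm, min_should_match(3, mm))
```

For a three-word query such as diego armando maradona, mm=100% requires all three clauses to match and mm=0 requires none of them in particular.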
Thanks in advance!
Answer 0 (score: 0)
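Putting quotes around the query string above forces a phrase query, which means only exact matches of the whole phrase are considered. Remove the quotes, use parentheses instead, and try the pf, pf2 and pf3 parameters to boost documents that match longer phrases (note that pf2 and pf3 belong to the edismax parser, so defType would need to be edismax for those two).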
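A possible way to build such a request is sketched below in Python. The host, path and the field names namedentities and bagofwords are taken from the question; the function name, the use of defType=edismax and all pf/pf2/pf3 boost values are assumptions chosen for illustration, not recommended settings.

```python
from urllib.parse import urlencode


def build_select_url(terms: list[str],
                     base: str = "http://localhost:8080/solr/select") -> str:
    """Build a select URL with the terms unquoted and phrase boosts added."""
    params = {
        # Parentheses instead of quotes: each word becomes its own clause.
        "q": "(" + " ".join(terms) + ")",
        "defType": "edismax",              # assumption: needed for pf2/pf3
        "qf": "namedentities^100 bagofwords^1",
        "pf": "bagofwords^50",             # hypothetical boost: whole phrase
        "pf2": "bagofwords^20",            # hypothetical boost: word pairs
        "pf3": "bagofwords^30",            # hypothetical boost: word triples
        "fl": "*,score",
        "debugQuery": "true",
        "mm": "0",
    }
    # urlencode percent-escapes spaces, ^ and parentheses for the URL.
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url(["diego", "armando", "maradona"]))
```

With debugQuery=true left on, the parsed query in the response shows whether the phrase boosts were actually applied.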