不太确定如何说出这个标题。基本上,当我搜索'anim'时它会找到'动物',但是当我搜索'anima'时它找不到任何东西。然后,如果我搜索“动物”,它会再次找到“动物”......
有没有人有任何想法为什么它可能不适用于'anima'?这似乎发生在大多数单词中 - 但是在不同的字符中 - 例如'eleph'和'elephan'很好 - 但'elepha'不会返回任何东西。
以下是查询和结果:
查询1(好的)
/ solr的/选择FQ =类型:标签&安培; Q =名:动画
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
<lst name="params">
<str name="fq">type:tag</str>
<str name="q">name:anim</str>
</lst>
</lst>
<result name="response" numFound="1" start="0">
<doc>
<int name="id">1</int>
<str name="name">Animals</str>
<arr name="name_auto">
<str>Animals</str>
<str>Animals</str>
</arr>
<date name="timestamp">2012-08-01T08:16:38.789Z</date>
<str name="type">tag</str>
<str name="unique_id">tag_1</str>
</doc>
</result>
</response>
查询2(不行)
/ solr的/选择FQ =类型:标签&安培; Q =名:灵魂
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
<lst name="params">
<str name="fq">type:tag</str>
<str name="q">name:anima</str>
</lst>
</lst>
<result name="response" numFound="0" start="0"/>
</response>
查询3(好的)
/ solr的/选择FQ =类型:标签&安培; Q =名称:动物
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
<lst name="params">
<str name="fq">type:tag</str>
<str name="q">name:animal</str>
</lst>
</lst>
<result name="response" numFound="1" start="0">
<doc>
<int name="id">1</int>
<str name="name">Animals</str>
<arr name="name_auto">
<str>Animals</str>
<str>Animals</str>
</arr>
<date name="timestamp">2012-08-01T08:16:38.789Z</date>
<str name="type">tag</str>
<str name="unique_id">tag_1</str>
</doc>
</result>
</response>
编辑1:
字段定义
<field name="name" type="text" indexed="true" stored="true" required="true" />
fieldType:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<!-- Case insensitive stop word removal.
add enablePositionIncrements=true in both the index and query
analyzers to leave a 'gap' for more accurate phrase queries.
-->
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
编辑2:
通过分析器传递字符串:
答案 0 :(得分:1)
安萨里是对的,问题是由于阻止。您发布的Solr架构证明了它,因为您正在使用PorterStemFilterFactory
。如果要搜索部分单词,可以尝试使用通配符查询,具体取决于您使用的查询解析器。如果你使用的是SOlr 3.x,它们可能太慢了,而使用Solr 4.x时,它已经得到了很大的改进。您可能想要EdgeNGrams,以便anima
也匹配animals
。