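I have a collection of data in Solr, and I need to search it for every combination of the words the user enters.
For example, if the user enters the text "House Tree Spain", Solr should search for "House Tree Spain", "House Tree", "House Spain", "Tree Spain", "House", "Tree", and "Spain".
I am using "solr.ShingleFilterFactory", but only in the query analyzer.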
<fieldType name="generic" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- generic -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <!-- spanish -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_es.txt" />
    <!-- english -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- generic -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <!-- spanish -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_es.txt" />
    <!-- english -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" />
    <filter class="solr.ShingleFilterFactory" maxShingleSize="10" outputUnigramsIfNoShingles="true"/>
  </analyzer>
</fieldType>
How can I change the schema to get the results I am looking for?
Answer 0 (score: 0)
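You have to apply the shingle filter in both the query analyzer and the index analyzer. At index time, it creates the tokens "House Tree" and "Tree Spain" and puts them into the index. At query time, it creates the same tokens from the query and searches for them in the index. If you omit either of these steps, "House Tree" will never match, see?
PS. A shingle size of 10 is huge. For this particular example you only need 2. Set it as low as you can; otherwise your index will become very large.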
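As a rough sketch of that suggestion, the field type from the question could be changed along these lines. This is an assumption about how the advice maps onto the schema, not a tested configuration: the shingle filter is added as the last step of both analyzers, maxShingleSize is lowered to 2 as suggested in the PS, and outputUnigrams is left at "true" (its default) so that single words such as "House" stay searchable.

```xml
<fieldType name="generic" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_es.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <!-- added (assumption): same shingle settings as the query analyzer,
         so two-word shingles are written to the index -->
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_es.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <!-- changed (assumption): maxShingleSize lowered from 10 to 2 -->
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true"/>
  </analyzer>
</fieldType>
```

Existing documents have to be reindexed after a change to the index analyzer, because shingles are only written when a document is analyzed at index time. Note also that shingles are built from adjacent tokens, so a pair like "House Spain" would probably still be matched through its single-word tokens rather than as a shingle.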