QA site: snknop38we.azurewebsites.net/
Query example: Solr: GETting 'q=(111 AND (published:True) AND ((entity_type_id:19)) AND ((available_start_date_time_utc:[* TO NOW]) OR (*:* -available_start_date_time_utc:[* TO *])) AND ((available_end_date_time_utc:[NOW TO *]) OR (*:* -available_end_date_time_utc:[* TO *]))), start=0, rows=20, qf=name short_description published=true is_out_of_stock=false, hl=true, hl.fl=name,short_description' from '/spell'
Expected result: VM11110xl Kramer
Current result:
Schema field type for the name & short description fields:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
<!--<filter class="solr.SnowballPorterFilterFactory" language="Russian" protected="lang/protwords_lt.txt"/>-->
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt"/>
<!--<filter class="solr.SynonymFilterFactory" synonyms="lang/synonyms_ru.txt" ignoreCase="true" expand="true"/>-->
<filter class="solr.LowerCaseFilterFactory"/>
<!--<filter class="solr.SnowballPorterFilterFactory" language="Russian" protected="lang/protwords_ru.txt"/>-->
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
How can we modify our schema to support searching on numbers? We also don't want to lose the current search functionality.
Answer 0: (score: 1)
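The main problem is that you want to match a substring of a token, so, depending on what you are trying to achieve, adding an NGramFilter to the chain can be a solution. You will have to tune its values to get the hits you are looking for, since it will also match "110", depending on how you structure your data.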
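As a rough illustration of the n-gram idea, the fragment below shows one way the filter line might look. The `minGramSize` and `maxGramSize` values are assumptions chosen for the example, not values from the question, and the token in the comment assumes the product name has already been lowercased by the existing chain.

```xml
<!-- Sketch only: gram sizes are assumed values and need tuning. -->
<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="10"/>
<!--
  For the lowercased token vm11110xl, the 3-character grams are:
    vm1, m11, 111, 111, 110, 10x, 0xl
  so a query for 111 finds the product, and so does a query for 110.
-->
```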
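If you only want to match the start of each token, you can use an EdgeNGramFilter, or you can use a wildcard in the search string (field:111*). Keep in mind, though, that a wildcard may disable other parts of the token processing, so in that case you are better off using the EdgeNGramFilter.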
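A minimal sketch of the edge n-gram variant follows. The gram sizes are again assumed values. Because edge n-grams only cover the start of a token, the comment assumes the number already forms its own token (for example `11110`). For the unsplit token `vm11110xl`, every gram would begin with `vm` and a query for `111` would not match.

```xml
<!-- Sketch only: gram sizes are assumed values and need tuning. -->
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="10"/>
<!--
  For a token 11110 the emitted prefixes are:
    11, 111, 1111, 11110
  For a token vm11110xl they are vm, vm1, vm11, ... (nothing starting with 111).
-->
```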
In both cases you only need to add the n-gram filter at index time, not at query time.
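Putting the index-only placement together, a field type could be sketched as below. The type name `text_general_ngram` is hypothetical, the gram sizes are assumptions, and the rest of the chain is copied from the question. The query analyzer is left without any n-gram filter, as described above.

```xml
<!-- Hypothetical type name; gram sizes are assumptions to tune. -->
<fieldType name="text_general_ngram" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <!-- Index side only: emit substrings of each token. -->
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="10"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <!-- No n-gram filter here: the query term is matched against the indexed grams. -->
  </analyzer>
</fieldType>
```

Changing the index analyzer only affects documents indexed afterwards, so existing documents would need to be reindexed. Note also that tokens longer than `maxGramSize` may no longer be indexed whole, so long query terms could stop matching unless the limit is raised.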
Answer 1: (score: 0)
Use the following schema:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
<!--<filter class="solr.SnowballPorterFilterFactory" language="Russian" protected="lang/protwords_lt.txt"/>-->
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt"/>
<!--<filter class="solr.SynonymFilterFactory" synonyms="lang/synonyms_ru.txt" ignoreCase="true" expand="true"/>-->
<filter class="solr.LowerCaseFilterFactory"/>
<!--<filter class="solr.SnowballPorterFilterFactory" language="Russian" protected="lang/protwords_ru.txt"/>-->
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
I have used WordDelimiterFilterFactory. It splits words into subwords according to the rules described at the source below.
Source: http://www.pathbreak.com/blog/solr-text-field-types-analyzers-tokenizers-filters-explained
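To make the splitting concrete, here is the index-time filter line from the schema above with an annotated example. The input token and the resulting parts are an illustration based on the filter's documented behavior, not output captured from the QA site.

```xml
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1"
        catenateWords="1" catenateNumbers="1" catenateAll="0"
        splitOnCaseChange="1"/>
<!--
  Assumed input token (already lowercased earlier in the chain): vm11110xl
  splitOnNumerics defaults to 1, so letter/digit transitions split the token:
    vm | 11110 | xl
  catenateNumbers="1" would additionally join adjacent number parts,
  for example 111-10 becomes 11110.
-->
```

Under these assumptions the indexed number part is `11110`, so a query for `111` would likely still need a prefix or n-gram step (as in the other answer) to match it.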