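I have a Solr installation that queries content on a Drupal site. Many of the title fields begin with punctuation, so when I sort by title, the entries that start with punctuation appear at the top of the list.
I'd like Solr to ignore the punctuation when sorting by title, but none of the solutions I've tried have worked.
I'm fairly new to Solr, so I may well be doing something very simple wrong... I don't really understand what's going on in the schema.xml file!
The title field is called label in Solr, and I've tried various things with solr.PatternReplaceFilterFactory, none of which worked.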
<field name="label" type="text" indexed="true" stored="true" termVectors="true" omitNorms="true"/>
<copyField source="label" dest="sort_label"/>
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="(^\p{Punct}+)" replacement="" replace="all"
    />
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.WordDelimiterFilterFactory"
            protected="protwords.txt"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="0"
            splitOnCaseChange="0"
            preserveOriginal="1"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    …
  </analyzer>
</fieldType>
My query is:
start=0&rows=25&q=education&fl=id%2Centity_id%2Centity_type%2Cbundle%2Cbundle_name%2Csort_label%2Css_language%2Cis_comment_count%2Cds_created%2Cds_changed%2Cscore%2Cpath%2Curl%2Cis_uid%2Ctos_name%2Czm_parent_entity%2Css_filemime%2Css_file_entity_title%2Css_file_entity_url&pf=content%5E2.0&&sort=sort_label%20asc
Answer 0 (score: 1)
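This is done with WordDelimiterFilterFactory. Set generateWordParts=1.
Add this filter to the analyzer of your field type, as in the definition below.
After modifying schema.xml, restart the server and re-index your data.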
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.WordDelimiterFilterFactory"
            protected="protwords.txt"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="0"
            splitOnCaseChange="0"
            preserveOriginal="1"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
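One thing that may be worth checking: the copyField in the question copies the raw label value into sort_label, so the sort order is decided by the analyzer on sort_label's own field type rather than by the one on text. Sorting also generally needs a field that yields a single token per document. As a rough sketch only, assuming sort_label can be given its own type (the name sortLabelNoPunct is made up for illustration, and the field attributes are guesses rather than the Drupal module's actual definition), the leading punctuation could be stripped like this:

```xml
<!-- Hypothetical type name; keeps each title as one token so it sorts as a whole -->
<fieldType name="sortLabelNoPunct" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizerFactory emits the entire field value as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase so the sort is case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drop leading punctuation and whitespace; ^ matches the start of the whole title here -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^[\p{Punct}\s]+" replacement="" replace="first"/>
  </analyzer>
</fieldType>

<!-- Assumed definition: single-valued, stored because the query's fl asks for sort_label -->
<field name="sort_label" type="sortLabelNoPunct" indexed="true" stored="true" multiValued="false"/>
```

As with any other schema.xml change, this would need a server restart and a full re-index before the sort order changes.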