Lucene SOLR没有在搜索结果中突出显示点,斜线等特殊字符

时间:2014-06-27 09:38:02

标签: apache solr lucene

就像在titile中一样,我从solr得到了结果,并且特殊字符没有在搜索词中突出显示

<em>00</em>:<em>00.000Z</em> 

solr参数

&hl.simple.pre=<em>&hl.simple.post=</em>

示例查询:all:*并将Hello / World作为

<em>Hello</em> / <em>World</em>

现场分析仪:

<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
      add enablePositionIncrements=true in both the index and query
      analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="lang/stopwords_en.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
-->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="lang/stopwords_en.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
-->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

1 个答案:

答案 0 :(得分:0)

StandardTokenizer会清除像/这样的字符,所以你在这里寻找的内容实际上是WhitespaceTokenizer。除此之外,冒号:符号对于lucene查询解析器和edismax具有特殊的意义,所以也许你想用更简单但更健壮的dismax查询解析器来试试运气