How do I set the tokenizer for a Solr field in Spring Boot?

Asked: 2017-11-10 13:50:47

Tags: spring solr

I have a Solr document:

@SolrDocument(solrCoreName = "mydocument")
public class MyDocument {


    @Indexed(name = "email", type = "text_general")
    private String email;

    ...
}

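I want to set the tokenizer for this field to the Keyword Tokenizer, because I run into problems when searching by email with queries such as user@site*.

How can I do this?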

1 Answer:

Answer 0: (score: 2)

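Spring Boot has no control over these settings (they are part of what Solr calls the schema).

You have mapped the field to the Solr field email with the field type text_general. To change anything about how this field is indexed or searched, you need to update schema.xml or managed-schema (depending on your setup) and change the fieldType definition.

Below is the standard text_general definition. You could change solr.StandardTokenizerFactory to solr.KeywordTokenizerFactory here (however, I would recommend creating a new fieldType for this specific email field instead):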

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100"> 
  <analyzer type="index"> 
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
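
A dedicated field type for the email field, as recommended above, might look roughly like the sketch below. The name email_exact is made up for illustration and does not come from the question. Keeping the lowercase filter is also an assumption, and it can be dropped if case-sensitive matching is wanted.

```xml
<!-- Hypothetical field type: "email_exact" is an illustrative name, not part of the stock schema -->
<fieldType name="email_exact" class="solr.TextField" positionIncrementGap="100">
  <!-- A single analyzer element applies to both indexing and querying -->
  <analyzer>
    <!-- KeywordTokenizer emits the whole input as one token, so user@site.com is not split at "@" or "." -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Optional assumption: lowercase the token so matching is case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The email field would then reference this type (type="email_exact") instead of text_general, and existing documents need to be reindexed before the new analysis takes effect.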