Indexing documents with StandardTokenizer, LowerCaseFilter and EdgeNGramFilter using Lucene 5.2.0 (latest)

Time: 2016-04-27 09:56:40

Tags: indexing filter lucene tokenize

How do I index documents in Lucene 5.2.0 using StandardTokenizer with LowerCaseFilter and EdgeNGramFilter applied?

1 answer:

Answer 0: (score: 1)

Try this (a Solr field type definition):

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
    </analyzer>
</fieldType>

Using Java:

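    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.Tokenizer;
    import org.apache.lucene.analysis.core.LowerCaseFilter;
    import org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter;
    import org.apache.lucene.analysis.standard.StandardFilter;
    import org.apache.lucene.analysis.standard.StandardTokenizer;

    // Goes inside an Analyzer subclass; Analyzer.tokenStream is final in Lucene 5.x
    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        Tokenizer source = new StandardTokenizer();
        TokenStream result = new StandardFilter(source);
        result = new LowerCaseFilter(result);
        // Lucene 5.x has no Side argument; grams always start at the front of the token
        result = new EdgeNGramTokenFilter(result, 1, 20);
        return new TokenStreamComponents(source, result);
    }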
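
To get a feel for which terms this chain puts into the index, here is a rough plain-Java sketch that imitates it without Lucene. The class `EdgeNGramSketch` and its `analyze` method are made up for illustration and are not Lucene APIs. Splitting on non-alphanumeric characters is only an approximation of StandardTokenizer, which follows Unicode word-break rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, NOT part of Lucene: imitates the term output of
// StandardTokenizer -> LowerCaseFilter -> EdgeNGramTokenFilter for simple text.
public class EdgeNGramSketch {

    // Returns the front-edge n-grams of every token, in order of appearance.
    static List<String> analyze(String text, int minGram, int maxGram) {
        List<String> terms = new ArrayList<>();
        // Assumption: splitting on runs of non-letter/non-digit characters
        // stands in for StandardTokenizer's Unicode word-break rules.
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // split() yields an empty first piece if text starts with a separator
            }
            String lower = token.toLowerCase(Locale.ROOT);
            // Emit prefixes of length minGram..maxGram, capped at the token length.
            int longest = Math.min(maxGram, lower.length());
            for (int len = minGram; len <= longest; len++) {
                terms.add(lower.substring(0, len));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // prints [l, lu, luc, luce, lucen, lucene, i, in, ind, inde, index]
        System.out.println(analyze("Lucene Index", 1, 20));
    }
}
```

Because every prefix is stored as its own term, a query for `luc` matches the document directly. That is presumably why the Solr definition above applies the filter only in the `index` analyzer.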

Check this link