Lucene Tokenizer deprecated

Time: 2015-05-28 04:49:27

Tags: java lucene tokenize analyzer

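The following Analyzer extension uses a number of deprecated constructors. What are the non-deprecated replacements for StandardTokenizer, StandardFilter, LowerCaseFilter and StopFilter, as used below?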

public class PorterAnalyzer extends Analyzer {

  private final Version version;

  public PorterAnalyzer(Version version) {
    this.version = version;
  }

  @Override
  @SuppressWarnings("resource")
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    final StandardTokenizer src = new StandardTokenizer(version, reader);
    TokenStream tok = new StandardFilter(version, src);
    tok = new LowerCaseFilter(version, tok);
    tok = new StopFilter(version, tok, StandardAnalyzer.STOP_WORDS_SET);
    tok = new PorterStemFilter(tok);
    return new TokenStreamComponents(src, tok);
  }

}

1 answer:

Answer 0: (score: 0)

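Drop the Version arguments.

I assume you are using Lucene 4.10 or something close to it. The constructors that take a Version argument have been deprecated (and were removed as of version 5.0), and replaced by constructors that do not take that argument.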

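import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.en.PorterStemFilter;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

public class PorterAnalyzer extends Analyzer {
  @Override
  @SuppressWarnings("resource")
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    // Same chain as in the question, minus the Version argument on each constructor.
    final StandardTokenizer src = new StandardTokenizer(reader);
    TokenStream tok = new StandardFilter(src);
    tok = new LowerCaseFilter(tok);
    tok = new StopFilter(tok, StandardAnalyzer.STOP_WORDS_SET);
    tok = new PorterStemFilter(tok);
    return new TokenStreamComponents(src, tok);
  }
}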