For Lucene 3.6.2, I have the following analyzer:
public final class StandardAnalyzerV36 extends Analyzer {
    private Analyzer analyzer;

    public StandardAnalyzerV36() {
        analyzer = new StandardAnalyzer(Version.LUCENE_36);
    }

    public StandardAnalyzerV36(Set<?> stopWords) {
        analyzer = new StandardAnalyzer(Version.LUCENE_36, stopWords);
    }

    @Override
    public final TokenStream tokenStream(String fieldName, Reader reader) {
        return analyzer.tokenStream(fieldName, new HTMLStripCharFilter(CharReader.get(reader)));
    }

    @Override
    public final TokenStream reusableTokenStream(String fieldName, Reader reader) throws IOException {
        return analyzer.reusableTokenStream(fieldName, reader);
    }
}
Could you help me migrate it to Lucene 5.5.0? The Analyzer API has changed in the newer versions.
UPDATED
I have reimplemented the analyzer as follows:
public final class StandardAnalyzerV36 extends Analyzer {
    public static final CharArraySet STOP_WORDS_SET = StopAnalyzer.ENGLISH_STOP_WORDS_SET;

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        final ClassicTokenizer src = new ClassicTokenizer();
        TokenStream tok = new StandardFilter(src);
        tok = new StopFilter(new LowerCaseFilter(tok), STOP_WORDS_SET);
        return new TokenStreamComponents(src, tok);
    }

    @Override
    protected Reader initReader(String fieldName, Reader reader) {
        // Strip HTML markup before the text reaches the tokenizer.
        return new HTMLStripCharFilter(reader);
    }
}
But my test fails on the following call:
tokens = LuceneUtils.tokenizeString(analyzer, "[{(RDBMS)}]");

public static List<String> tokenizeString(Analyzer analyzer, String string) {
    List<String> result = new ArrayList<String>();
    try {
        TokenStream stream = analyzer.tokenStream(null, new StringReader(string));
        stream.reset();
        while (stream.incrementToken()) {
            result.add(stream.getAttribute(CharTermAttribute.class).toString());
        }
    } catch (IOException e) {
        // not thrown b/c we're using a string reader...
        throw new RuntimeException(e);
    }
    return result;
}
with the following exception:
java.lang.IllegalStateException: TokenStream contract violation: close() call missing
    at org.apache.lucene.analysis.Tokenizer.setReader(Tokenizer.java:90)
    at org.apache.lucene.analysis.Analyzer$TokenStreamComponents.setReader(Analyzer.java:315)
    at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:143)
What is wrong with this code?
Answer 0 (score: 0)
I finally got it working:
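The exception message ("close() call missing") suggests that a previously obtained TokenStream was never closed before the analyzer reused its components. tokenizeString() in the question calls reset() and incrementToken() but never end() or close(). The following is a minimal sketch of a helper that follows the full reset / incrementToken / end / close workflow, assuming Lucene 5.5.0 on the classpath. The class name TokenizeSketch is made up for illustration, and this is not necessarily the exact change that was made here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class; only the method body matters.
final class TokenizeSketch {

    static List<String> tokenizeString(Analyzer analyzer, String string) {
        List<String> result = new ArrayList<String>();
        // try-with-resources guarantees close(), even if incrementToken() throws.
        try (TokenStream stream = analyzer.tokenStream(null, new StringReader(string))) {
            // Fetch the attribute once, before consuming the stream.
            CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
            stream.reset();
            while (stream.incrementToken()) {
                result.add(term.toString());
            }
            // Signal end-of-stream before the stream is closed.
            stream.end();
        } catch (IOException e) {
            // Not expected with a StringReader, but rethrow just in case.
            throw new RuntimeException(e);
        }
        return result;
    }
}
```

With the stream closed after every call, the same Analyzer instance can be passed to tokenizeString() repeatedly on one thread without tripping the contract check in Tokenizer.setReader().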