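I have a Lucene entry like this:
"increased heartrate"
When I encounter the text "increased heart rate", I want to match this entry in the index. This means I need to tokenize the input as: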
{increased, heart, rate}
{increasedheart, rate}
{increased, heartrate}
How can I do this with Lucene 6+?
Kind regards
Answer 0 (score: 0)
Here is how I did it; suggestions are welcome:
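// Package names as of Lucene 6.x (analyzers-common); some classes moved in later versions.
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.miscellaneous.HyphenatedWordsFilter;
import org.apache.lucene.analysis.shingle.ShingleFilter;

public class MyAnalyzer extends Analyzer {

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        final Tokenizer src = new WhitespaceTokenizer();
        TokenStream tok = new LowerCaseFilter(src);
        tok = new HyphenatedWordsFilter(tok);
        // getStopFilter is a helper that is not shown in this answer.
        tok = getStopFilter(tok);
        // Max shingle size 2: single tokens plus pairs of adjacent tokens.
        ShingleFilter filter = new ShingleFilter(tok, 2);
        // Empty separator: "heart" + "rate" becomes "heartrate".
        filter.setTokenSeparator("");
        return new TokenStreamComponents(src, filter);
    }
}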
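Note the ShingleFilter and the call to setTokenSeparator: with an empty separator, adjacent tokens are joined without a space, so "heart rate" also yields the token "heartrate".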
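To make the effect of the shingles concrete, here is a rough plain-Java sketch of the tokens I would expect for "increased heart rate". It does not use Lucene at all: the class ShingleSketch and its shingle method are made up for illustration, and it assumes the stop filter removes none of the three words and that single tokens are emitted alongside the pairs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: mimics single tokens plus 2-token shingles joined with an
// empty separator. This is NOT Lucene's ShingleFilter implementation.
final class ShingleSketch {

    // For each token, emit the token itself, then the token glued to its successor.
    static List<String> shingle(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            out.add(tokens.get(i));                          // single token
            if (i + 1 < tokens.size()) {
                out.add(tokens.get(i) + tokens.get(i + 1));  // pair, separator ""
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Whitespace split plus lowercasing stands in for the tokenizer and LowerCaseFilter.
        List<String> tokens = Arrays.asList("Increased heart rate".toLowerCase().split("\\s+"));
        System.out.println(shingle(tokens));
        // prints [increased, increasedheart, heart, heartrate, rate]
    }
}
```

If the index entry "increased heartrate" goes through the same kind of analysis, it should produce "increased" and "heartrate" among its tokens, which is where the two texts overlap.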