Custom Solr Stemming Filter

Time: 2016-06-27 08:40:27

Tags: solr filter arabic stem

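I am trying to implement a custom Solr filter that stems Arabic words. The filter class is shown below, but I keep getting a "possible analysis error" when indexing documents. I am using Khoja's stemmer.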

public final class CustomArbicStemFilter extends TokenFilter {
    private CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private CustomArabicStemmer stemmer = null;

    public CustomArbicStemFilter(TokenStream input) {
        super(input);
        this.stemmer = new CustomArabicStemmer();
    }

    public final boolean incrementToken() throws IOException {
        if (input.incrementToken()) {
            char termBuffer[] = termAtt.buffer();
            String currentWord = new String(termBuffer);
            String stemmedWord = stemmer.stemWord(currentWord);
            char finalTerm[] = stemmedWord.toCharArray();
            termAtt.copyBuffer(finalTerm, 0, finalTerm.length);
            return true;
        } else {
            return false;
        }
    }
}
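
One possible cause, stated as an assumption because the question does not include the nested exception: in Lucene, the array returned by `CharTermAttribute.buffer()` can be longer than the current term, and only the first `termAtt.length()` characters are valid. `new String(termBuffer)` may therefore hand the stemmer trailing NUL characters or leftovers from an earlier token. The sketch below reproduces that effect with a plain `char[]` and no Lucene dependency; the class `TermBufferDemo` and its two helper methods are hypothetical names used only for illustration.

```java
/** Stdlib-only stand-in for a reused term buffer; this is not Lucene code. */
final class TermBufferDemo {

    /** Mimics the filter above: converts the entire backing array. */
    static String fromWholeBuffer(char[] buffer) {
        return new String(buffer);
    }

    /** Converts only the first `length` characters, i.e. the valid term. */
    static String fromValidRegion(char[] buffer, int length) {
        return new String(buffer, 0, length);
    }

    public static void main(String[] args) {
        // A reusable buffer that is larger than any single term.
        char[] buffer = new char[16];

        // First token fills 7 characters; the rest stay '\u0000'.
        "writers".getChars(0, 7, buffer, 0);

        // Second token overwrites only the first 4 characters.
        "book".getChars(0, 4, buffer, 0);
        int length = 4;

        // Whole array: "bookers" followed by NULs, 16 chars in total.
        System.out.println(fromWholeBuffer(buffer).length());

        // Valid region only: just the current term.
        System.out.println(fromValidRegion(buffer, length));
    }
}
```

If this is indeed the problem, the equivalent change in the filter would be to build `currentWord` from `termAtt.toString()` or `new String(termAtt.buffer(), 0, termAtt.length())` before calling the stemmer. Whether Khoja's stemmer then accepts every token is not something this sketch can confirm; the nested exception in the Solr log should show the actual root cause.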

0 answers:

No answers