I'm trying to create a custom analyzer in Lucene.net 4.8, but I'm running into an error I can't make sense of.
My analyzer code:
public class SynonymAnalyzer : Analyzer
{
    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    {
        String base1 = "lawnmower";
        String syn1 = "lawn mower";
        String base2 = "spanner";
        String syn2 = "wrench";
        SynonymMap.Builder sb = new SynonymMap.Builder(true);
        sb.Add(new CharsRef(base1), new CharsRef(syn1), true);
        sb.Add(new CharsRef(base2), new CharsRef(syn2), true);
        SynonymMap smap = sb.Build();
        Tokenizer tokenizer = new StandardTokenizer(Version.LUCENE_48, reader);
        TokenStream result = new StandardTokenizer(Version.LUCENE_48, reader);
        result = new SynonymFilter(result, smap, true);
        return new TokenStreamComponents(tokenizer, result);
    }
}
The code that builds the index is:
var fordFiesta = new Document();
fordFiesta.Add(new StringField("Id", "1", Field.Store.YES));
fordFiesta.Add(new TextField("Make", "Ford", Field.Store.YES));
fordFiesta.Add(new TextField("Model", "Fiesta 1.0 Developing", Field.Store.YES));
fordFiesta.Add(new TextField("FullText", "lawnmower Ford 1.0 Fiesta Developing spanner", Field.Store.YES));
Lucene.Net.Store.Directory directory = FSDirectory.Open(new DirectoryInfo(Environment.CurrentDirectory + "\\LuceneIndex"));
SynonymAnalyzer analyzer = new SynonymAnalyzer();
var config = new IndexWriterConfig(Version.LUCENE_48, analyzer);
var writer = new IndexWriter(directory, config);
writer.UpdateDocument(new Term("Id", "1"), fordFiesta);
writer.Flush(true, true);
writer.Commit();
writer.Dispose();
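However, when I run my code, it fails on the writer.UpdateDocument line with the following error:
TokenStream contract violation: Reset()/Dispose() call missing, Reset() called multiple times, or subclass does not call base.Reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.
I can't figure out where I'm going wrong.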
Answer 0 (score: 2)
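The problem is that your TokenStreamComponents is constructed with a different Tokenizer than the one used inside the result TokenStream. The analyzer hands each new reader to the tokenizer registered in TokenStreamComponents, so the second StandardTokenizer, the one that actually feeds the SynonymFilter, is never given the new input when the components are reused for another field. Changing the code to this should fix the problem: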
Tokenizer tokenizer = new StandardTokenizer(Version.LUCENE_48, reader);
TokenStream result = new SynonymFilter(tokenizer, smap, true);
return new TokenStreamComponents(tokenizer, result);
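For reference, here is a rough sketch of how the whole analyzer might look with that fix applied, plus a small helper that consumes the stream in the order the error message is talking about (Reset, IncrementToken, End, Dispose). A few things here are my own assumptions rather than part of the question: the `SynonymAnalyzerDemo` class and its `Tokens` method are hypothetical names, the synonym map is built once in the constructor instead of on every CreateComponents call, and the version enum is written as `LuceneVersion.LUCENE_48`, which is the name I know from the 4.8 beta NuGet packages (the question's build exposes it as `Version.LUCENE_48`, so adjust to whatever your build uses).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Synonym;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public class SynonymAnalyzer : Analyzer
{
    private readonly SynonymMap smap;

    public SynonymAnalyzer()
    {
        // Assumption: building the map once is enough, since the rules never change.
        var sb = new SynonymMap.Builder(true);
        sb.Add(new CharsRef("lawnmower"), new CharsRef("lawn mower"), true);
        sb.Add(new CharsRef("spanner"), new CharsRef("wrench"), true);
        smap = sb.Build();
    }

    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    {
        // Assumption: LuceneVersion is the enum name in your build (the question uses Version).
        Tokenizer tokenizer = new StandardTokenizer(LuceneVersion.LUCENE_48, reader);

        // The filter wraps the SAME tokenizer that is registered below.
        TokenStream result = new SynonymFilter(tokenizer, smap, true);
        return new TokenStreamComponents(tokenizer, result);
    }
}

// Hypothetical helper for inspecting what the analyzer emits.
public static class SynonymAnalyzerDemo
{
    public static List<string> Tokens(Analyzer analyzer, string text)
    {
        var tokens = new List<string>();
        using (TokenStream ts = analyzer.GetTokenStream("FullText", new StringReader(text)))
        {
            var term = ts.AddAttribute<ICharTermAttribute>();
            ts.Reset();                    // must be called before the first IncrementToken()
            while (ts.IncrementToken())
            {
                tokens.Add(term.ToString());
            }
            ts.End();                      // signals end of stream; Dispose() follows via using
        }
        return tokens;
    }
}
```

Calling `SynonymAnalyzerDemo.Tokens(analyzer, ...)` twice on the same analyzer instance is a quick way to exercise the reuse path that IndexWriter goes through when a document has more than one analyzed field, and the returned list should show "wrench" alongside "spanner" if the synonym rules are being applied.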