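I'm having some trouble with the Lucene Highlighter. I have a record in my Lucene index with three fields: title (TITLE), content (CONTENT), and classification (CLS).
When I search the index with "+(TITLE:test CONTENT:test) +CLS:dummy", the word "dummy" is highlighted in the title match. "dummy" is the value of my classification field, and I don't want it highlighted in the title. How can I avoid this?
Here is my test code: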
Directory directory = new RAMDirectory();
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_36, analyzer);
IndexWriter iw = new IndexWriter(directory, iwc);
//==================================== write the index ====================================
Document document = new Document();
Field _f_title = new Field("TITLE", "this is a test title - dummy",Field.Store.YES, Field.Index.ANALYZED);
Field _f_content = new Field("CONTENT", "this is a test content",Field.Store.YES, Field.Index.ANALYZED);
Field _f_cls = new Field("CLS", "dummy",Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS);
document.add(_f_title);
document.add(_f_content);
document.add(_f_cls);
iw.addDocument(document);
iw.close();
//==================================== search the index ====================================
SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter("<span style='color:red;'>", "</span>");
SimpleFragmenter fragmenter = new SimpleFragmenter(100);
IndexReader ir = IndexReader.open(directory);
IndexSearcher is = new IndexSearcher(ir);
QueryParser parser = new QueryParser(Version.LUCENE_36, "", analyzer);
Query query = parser.parse("+(TITLE:test CONTENT:test) +CLS:dummy");
TopDocs docs = is.search(query, 10);
Highlighter highlighter = new Highlighter(htmlFormatter, new QueryScorer(query));
highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);
highlighter.setTextFragmenter(fragmenter);
for (ScoreDoc cDoc : docs.scoreDocs) {
    Document _document = is.doc(cDoc.doc);
    System.out.println("title:" + highlighter.getBestFragment(analyzer, "TITLE", _document.get("TITLE")));
    System.out.println("content:" + highlighter.getBestFragment(analyzer, "CONTENT", _document.get("CONTENT")));
}
is.close();
// The searcher was built from an existing reader, so the reader is closed separately.
ir.close();
This program outputs:
title:this is a <span style='color:red;'>test</span> title - <span style='color:red;'>dummy</span>
content:this is a <span style='color:red;'>test</span> content
What I actually want is:
title:this is a <span style='color:red;'>test</span> title - dummy
content:this is a <span style='color:red;'>test</span> content
What should I do?
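
One direction that may help, offered as an unverified assumption rather than a confirmed fix: a highlighter built with new QueryScorer(query) considers the terms of every field in the query, which would explain why the CLS term "dummy" is also marked inside the TITLE text. Lucene 3.6's QueryScorer also has a constructor that takes a field name, QueryScorer(Query, String), so building one highlighter per field, for example with new QueryScorer(query, "TITLE"), should limit highlighting to that field's own terms. The plain-Java sketch below does not use Lucene at all; it only illustrates the per-field filtering idea. The class FieldHighlightSketch, its methods, and the hand-built term map are hypothetical stand-ins for the parsed query, and the whitespace tokenization is a simplification of what StandardAnalyzer does.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class FieldHighlightSketch {

    // Hypothetical stand-in for the parsed query
    // "+(TITLE:test CONTENT:test) +CLS:dummy": field name -> terms for that field.
    public static Map<String, Set<String>> exampleQueryTerms() {
        Map<String, Set<String>> terms = new HashMap<String, Set<String>>();
        terms.put("TITLE", new HashSet<String>(Arrays.asList("test")));
        terms.put("CONTENT", new HashSet<String>(Arrays.asList("test")));
        terms.put("CLS", new HashSet<String>(Arrays.asList("dummy")));
        return terms;
    }

    // Wraps a token in the highlight tag only if the query asked for that term
    // in the field being highlighted; terms of other fields are ignored.
    public static String highlight(String field, String text,
                                   Map<String, Set<String>> termsByField) {
        Set<String> wanted = termsByField.get(field);
        String[] tokens = text.split(" ");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            String token = tokens[i];
            if (wanted != null && wanted.contains(token.toLowerCase(Locale.ROOT))) {
                out.append("<span style='color:red;'>").append(token).append("</span>");
            } else {
                out.append(token);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> terms = exampleQueryTerms();
        System.out.println("title:" + highlight("TITLE", "this is a test title - dummy", terms));
        System.out.println("content:" + highlight("CONTENT", "this is a test content", terms));
    }
}
```

Because the TITLE lookup only sees the term "test", the sketch leaves "dummy" unmarked in the title while still highlighting "test" in both fields, which matches the desired output above. Whether the field-aware QueryScorer constructor gives the same result against the real index would need to be checked by running the test code with it.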