I'm trying to get started with Lucene. The code I use to index documents is:
public void index(String type, String words) {
    IndexWriter indexWriter = null;
    try {
        if (dir == null)
            dir = createAndPropagate();
        indexWriter = new IndexWriter(dir, new StandardAnalyzer(), true,
                new KeepOnlyLastCommitDeletionPolicy(),
                IndexWriter.MaxFieldLength.UNLIMITED);
        Field wordsField = new Field(FIELD_WORDS, words, Field.Store.YES,
                Field.Index.ANALYZED);
        Field typeField = new Field(FIELD_TYPE, type, Field.Store.YES,
                Field.Index.ANALYZED);
        Document doc = new Document();
        doc.add(wordsField);
        doc.add(typeField);
        indexWriter.addDocument(doc);
        indexWriter.commit();
    } catch (IOException e) {
        logger.error("Problems while adding entry to index.", e);
    } finally {
        try {
            if (indexWriter != null)
                indexWriter.close();
        } catch (IOException e) {
            logger.error("Unable to close index writer.", e);
        }
    }
}
The search code looks like this:
public List<TagSearchEntity> searchFor(final String type, String words,
        int amount) {
    List<TagSearchEntity> result = new ArrayList<TagSearchEntity>();
    try {
        if (dir == null)
            dir = createAndPropagate();
        for (final Document doc : searchFor(dir, type, words, amount)) {
            @SuppressWarnings("serial")
            TagSearchEntity searchResult = new TagSearchEntity() {{
                setType(type);
                setWords(doc.getField(FIELD_WORDS).stringValue());
            }};
            result.add(searchResult);
        }
    } catch (IOException e) {
        logger.error("Problems while searching", e);
    }
    return result;
}

private List<Document> searchFor(Directory indexDirectory, String type,
        String words, int amount) throws IOException {
    Searcher indexSearcher = new IndexSearcher(indexDirectory);
    final Query tagQuery = new TermQuery(new Term(FIELD_WORDS, words));
    final Query typeQuery = new TermQuery(new Term(FIELD_TYPE, type));
    @SuppressWarnings("serial")
    BooleanQuery query = new BooleanQuery() {{
        add(tagQuery, BooleanClause.Occur.SHOULD);
        add(typeQuery, BooleanClause.Occur.MUST);
    }};
    List<Document> result = new ArrayList<Document>();
    for (ScoreDoc scoreDoc : indexSearcher.search(query, amount).scoreDocs) {
        result.add(indexSearcher.doc(scoreDoc.doc));
    }
    indexSearcher.close();
    return result;
}
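I have two use cases. The first adds a document of one type and searches for it, then adds a document of another type and searches for it, and so on. The second adds all the documents first and then searches for them. The first one works fine: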
@Test
public void testSearch() {
    search.index("type1", "test type1 for test purposes test test");
    List<TagSearchEntity> result = search.searchFor("type1", "test", 10);
    assertNotNull("Retrieved list should not be null.", result);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());
    search.index("type2", "test type2 for test purposes test test");
    result.clear();
    result = search.searchFor("type2", "test", 10);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());
    search.index("type3", "test type3 for test purposes test test");
    result.clear();
    result = search.searchFor("type3", "test", 10);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());
}
But the other one seems to index only the last document:
@Test
public void testBuggy() {
    search.index("type1", "test type1 for test purposes test test");
    search.index("type2", "test type2 for test purposes test test");
    search.index("type3", "test type3 for test purposes test test");
    List<TagSearchEntity> result = search.searchFor("type3", "test", 10);
    assertNotNull("Retrieved list should not be null.", result);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());
    result.clear();
    result = search.searchFor("type2", "test", 10);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());
    result.clear();
    result = search.searchFor("type1", "test", 10);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());
}
It successfully finds type3 but fails to find any of the others. If I reorder the index calls, it still finds only the last indexed document.
The Lucene version I'm using is:
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>2.4.1</version>
</dependency>
<dependency>
    <groupId>lucene</groupId>
    <artifactId>lucene</artifactId>
    <version>1.4.3</version>
</dependency>
What am I doing wrong? How can I make it index all of the documents?
Answer 0 (score: 2)
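A new index is being created on every index operation. The third argument to the IndexWriter constructor is the create flag, and it is set to true. According to the IndexWriter documentation, when this flag is set the writer creates a new index or overwrites an existing one. Set it to false to append to the existing index.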
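To see why only the last document survives, here is a minimal toy model of the create flag. It does not use Lucene at all: ToyDirectory, ToyWriter and indexAll are hypothetical names invented for this illustration, and the only behaviour modelled is "create == true discards whatever was there before". One caveat, stated as an assumption about Lucene 2.4 rather than something from the question: opening a writer with create == false on a directory that holds no index yet is expected to fail, so a common pattern is to pass true only for the very first writer (in Lucene itself, something like !IndexReader.indexExists(dir) could serve as that check).

```java
import java.util.ArrayList;
import java.util.List;

final class CreateFlagDemo {

    /** Hypothetical stand-in for a Lucene Directory: just a list of stored documents. */
    static final class ToyDirectory {
        final List<String> docs = new ArrayList<String>();
    }

    /** Hypothetical stand-in for IndexWriter; it models only the create flag. */
    static final class ToyWriter {
        private final ToyDirectory dir;

        ToyWriter(ToyDirectory dir, boolean create) {
            this.dir = dir;
            if (create) {
                // create == true starts a fresh index, discarding existing documents
                dir.docs.clear();
            }
        }

        void addDocument(String doc) {
            dir.docs.add(doc);
        }
    }

    /**
     * Opens one writer per document, as the index() method in the question does,
     * and returns how many documents the directory holds afterwards.
     */
    static int indexAll(boolean alwaysCreate, String... docs) {
        ToyDirectory dir = new ToyDirectory();
        for (String doc : docs) {
            // Either always create (the behaviour in the question), or create
            // only while the directory is still empty (the suggested fix).
            boolean create = alwaysCreate || dir.docs.isEmpty();
            new ToyWriter(dir, create).addDocument(doc);
        }
        return dir.docs.size();
    }

    public static void main(String[] args) {
        System.out.println(indexAll(true, "type1", "type2", "type3"));  // prints 1
        System.out.println(indexAll(false, "type1", "type2", "type3")); // prints 3
    }
}
```

With the flag always true, each new writer wipes the previous document, which matches testBuggy finding only type3. testSearch passes only because each search runs before the next writer overwrites the index.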