I found this example on the internet:
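Indexer.java

    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.CorruptIndexException;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class Indexer {
        private IndexWriter writer;

        @SuppressWarnings("deprecation")
        public Indexer(String indexDirectoryPath) throws IOException {
            Directory indexDirectory = FSDirectory.open(new File(indexDirectoryPath));
            // The third argument is the "create" flag of this IndexWriter constructor.
            writer = new IndexWriter(indexDirectory, new StandardAnalyzer(Version.LUCENE_36), true,
                    IndexWriter.MaxFieldLength.UNLIMITED);
        }

        public void close() throws CorruptIndexException, IOException {
            writer.close();
        }

        // Builds one document per file: tokenized contents plus stored name and path.
        private Document getDocument(File file) throws IOException {
            Document document = new Document();
            Field contentField = new Field(LuceneConstants.CONTENTS, new FileReader(file));
            Field fileNameField = new Field(LuceneConstants.FILE_NAME, file.getName(), Field.Store.YES,
                    Field.Index.NOT_ANALYZED);
            Field filePathField = new Field(LuceneConstants.FILE_PATH, file.getCanonicalPath(), Field.Store.YES,
                    Field.Index.NOT_ANALYZED);
            document.add(contentField);
            document.add(fileNameField);
            document.add(filePathField);
            return document;
        }

        public void indexFile(File file) throws IOException {
            Document document = getDocument(file);
            writer.addDocument(document);
        }

        // Indexes a single file and returns the number of documents in the index.
        public int createIndex(String file) throws IOException {
            indexFile(new File(file));
            return writer.numDocs();
        }
    }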
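Searcher.java

    import java.io.File;
    import java.io.IOException;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.CorruptIndexException;
    import org.apache.lucene.queryParser.ParseException;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class Searcher {
        IndexSearcher indexSearcher;
        QueryParser queryParser;
        Query query;

        @SuppressWarnings("deprecation")
        public Searcher(String indexDirectoryPath) throws IOException {
            Directory indexDirectory = FSDirectory
                    .open(new File(indexDirectoryPath));
            indexSearcher = new IndexSearcher(indexDirectory);
            queryParser = new QueryParser(Version.LUCENE_36,
                    LuceneConstants.CONTENTS, new StandardAnalyzer(
                            Version.LUCENE_36));
        }

        // Escapes query-syntax characters, then searches the contents field.
        public TopDocs search(String searchQuery) throws IOException,
                ParseException {
            query = queryParser.parse(QueryParser.escape(searchQuery));
            return indexSearcher.search(query, LuceneConstants.MAX_SEARCH);
        }

        public Document getDocument(ScoreDoc scoreDoc)
                throws CorruptIndexException, IOException {
            return indexSearcher.doc(scoreDoc.doc);
        }

        public void close() throws IOException {
            indexSearcher.close();
        }
    }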
LuceneConstants.java

    public class LuceneConstants {
        public static final String CONTENTS = "contents";
        public static final String FILE_NAME = "filename";
        public static final String FILE_PATH = "filepath";
        public static final int MAX_SEARCH = 10;
    }

This is how I use them:

    public static void main(String[] args) throws IOException, ParseException {
        {
            // First file
            Indexer indexer = new Indexer("index");
            indexer.createIndex("f1.txt");
            indexer.close();
            Searcher searcher = new Searcher(Constante.DIR_INDEX.getValor());
            TopDocs hits = searcher.search("Art. 1°");
            for (ScoreDoc scoreDoc : hits.scoreDocs) {
                org.apache.lucene.document.Document doc = searcher.getDocument(scoreDoc);
                String nomeArquivo = doc.get(LuceneConstants.FILE_PATH);
                System.out.println(nomeArquivo);
            }
        }
        System.out.println("-----");
        {
            // Second file
            Indexer indexer = new Indexer("index");
            indexer.createIndex("f2.txt");
            indexer.close();
            Searcher searcher = new Searcher(Constante.DIR_INDEX.getValor());
            TopDocs hits = searcher.search("Art. 1°");
            for (ScoreDoc scoreDoc : hits.scoreDocs) {
                org.apache.lucene.document.Document doc = searcher.getDocument(scoreDoc);
                String nomeArquivo = doc.get(LuceneConstants.FILE_PATH);
                System.out.println(nomeArquivo);
            }
        }
    }
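
Everything works fine up to the "// Second file" line.
After I index the second file, I can no longer find anything from the first file.
If I create a single Indexer instance, use that same instance to index both f1.txt and f2.txt, and then close it, it works the way I want. The problem is that if I close my application, open it again, and decide to index another file, I lose f1.txt and f2.txt.
Is there a way to make Lucene always keep the previous index when indexing new files?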

Answer 0 (score: 1)
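It looks like you are using an old version of Lucene (3.6 or earlier), right?
The third argument of the IndexWriter constructor specifies whether it should create a new index or open an existing one. If it is set to true, it overwrites the existing index (if there is one in the given directory). If you want to open the existing index without overwriting it, it should be false:

    writer = new IndexWriter(indexDirectory, new StandardAnalyzer(Version.LUCENE_36), false, IndexWriter.MaxFieldLength.UNLIMITED);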
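
One caveat with a hard-coded false: as far as I know, opening a writer with create = false throws an exception when the directory does not contain an index yet, so the very first run would still need true. A minimal sketch of choosing the flag at runtime follows. The class name `IndexOpenMode` and the helper `shouldCreateIndex` are hypothetical, and the check for files whose names start with `segments` is an assumption about how Lucene lays out an index on disk, not an official API.

```java
import java.io.File;

public class IndexOpenMode {

    /**
     * Hypothetical helper: returns true when the directory does not appear
     * to hold an index yet, i.e. when the writer should be opened with
     * create = true. Returns false when an index seems to exist already.
     */
    public static boolean shouldCreateIndex(File indexDir) {
        // list() returns null when the path is missing or is not a directory.
        String[] names = indexDir.list();
        if (names == null) {
            return true;
        }
        for (String name : names) {
            // Assumption: an existing Lucene index contains "segments*" files.
            if (name.startsWith("segments")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        File indexDir = new File(args.length > 0 ? args[0] : "index");
        boolean create = shouldCreateIndex(indexDir);
        System.out.println("create = " + create);
        // With Lucene 3.6 on the classpath, the flag would be passed on like this:
        // writer = new IndexWriter(indexDirectory, new StandardAnalyzer(Version.LUCENE_36),
        //         create, IndexWriter.MaxFieldLength.UNLIMITED);
    }
}
```

In the Indexer constructor from the question, the result of this helper would replace the literal true. Lucene 3.x also ships IndexReader.indexExists(Directory) for the same check, and from Lucene 3.1 on, IndexWriterConfig with OpenMode.CREATE_OR_APPEND should make this decision for you, so either of those is probably preferable to a hand-written file scan.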