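Is it possible to extract all the terms in a Lucene index as a list of strings? I couldn't find this functionality in the documentation. Thanks!
Answer 0 (score: 16)
In Lucene 4 (and 5):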
Terms terms = SlowCompositeReaderWrapper.wrap(directoryReader).terms("field");
Edit:
This seems to be the 'correct' way now (Lucene 6 and above):
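LuceneDictionary ld = new LuceneDictionary( indexReader, "field" );
// In Lucene 6 the accessor is getEntryIterator(); early 4.x releases named it getWordsIterator()
BytesRefIterator iterator = ld.getEntryIterator();
BytesRef byteRef = null;
while ( ( byteRef = iterator.next() ) != null )
{
    String term = byteRef.utf8ToString();
    // use or collect the term here
}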
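Since the question asks for a list of strings, here is a minimal sketch that collects the terms of one field into a List<String>. It assumes Lucene 8 or later, where MultiTerms.getTerms and ByteBuffersDirectory are available; the class name TermLister, the helper listTerms, the field name "field" and the sample values are made up for illustration and are not part of the answer above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.MultiTerms;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper class; assumes Lucene 8+ on the classpath.
public class TermLister {

    /** Collects every distinct term of one field, in the order the index stores them. */
    static List<String> listTerms(IndexReader reader, String field) throws IOException {
        List<String> result = new ArrayList<>();
        // Merges the per-segment term dictionaries; null if the field has no terms.
        Terms terms = MultiTerms.getTerms(reader, field);
        if (terms == null) {
            return result;
        }
        TermsEnum termsEnum = terms.iterator();
        BytesRef ref;
        while ((ref = termsEnum.next()) != null) {
            // The BytesRef may be reused by the enum, so convert it right away.
            result.add(ref.utf8ToString());
        }
        return result;
    }

    /** Builds a tiny in-memory index with made-up values and lists its terms. */
    static List<String> demo() throws IOException {
        try (Directory dir = new ByteBuffersDirectory()) {
            IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
            try (IndexWriter writer = new IndexWriter(dir, config)) {
                for (String value : new String[] {"banana", "apple", "banana"}) {
                    Document doc = new Document();
                    // StringField indexes the whole value as a single, unanalyzed term.
                    doc.add(new StringField("field", value, Field.Store.NO));
                    writer.addDocument(doc);
                }
            } // closing the writer commits the documents
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                return listTerms(reader, "field");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

Each term appears once no matter how many documents contain it. On Lucene 4 to 7 the same loop should work with MultiFields.getTerms in place of MultiTerms.getTerms.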
Answer 1 (score: 10)
Lucene 3:
Java:
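// 'path' is expected to be an org.apache.lucene.store.Directory
IndexReader indexReader = IndexReader.open(path);
// Enumerates the terms of all fields in the index
TermEnum termEnum = indexReader.terms();
while (termEnum.next()) {
    Term term = termEnum.term();
    System.out.println(term.text());
}
termEnum.close();
indexReader.close();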
Java (all terms of a specific field): How can I get the list of unique terms from a specific field in Lucene?