I am trying to retrieve the list of unique terms from a Lucene directory. During indexing, I create a field whose FieldType stores term vectors:
private void addDoc(IndexWriter writer, String content, String title, String id) throws IOException {
    Document doc = new Document();
    doc.add(new TextField("content", content, Field.Store.YES));
    doc.add(new TextField("title", title, Field.Store.YES));
    doc.add(new StringField("id", id, Field.Store.YES));

    FieldType type = new FieldType();
    type.setStored(true);
    type.setStoreTermVectors(true);
    IndexOptions options = IndexOptions.DOCS_AND_FREQS;
    type.setIndexOptions(options);

    Field field = new Field("termVector", content, type);
    doc.add(field);
    writer.addDocument(doc);
}
Indexing works fine, but when I try to retrieve the terms with a TermsEnum, I get no results:
private void buildVocabularyLucene(Directory directory) throws IOException {
    DirectoryReader reader = DirectoryReader.open(directory);
    Fields fields = MultiFields.getFields(reader);
    for (String field : fields) {
        if (!field.equals("termVector")) {
            continue;
        }
        Terms terms = fields.terms(field);
        TermsEnum termsEnum = terms.iterator();
        BytesRef text = termsEnum.next();
        while ((text) != null) {
            System.out.println(text.utf8ToString());
        }
    }
}
Does anyone know why this does not retrieve the text? I am using Lucene 6.5.1.
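
One thing that may be worth checking: the retrieval loop calls next() once before the while and never calls it again inside the loop body, so the enumeration is not advanced. In Lucene, TermsEnum.next() returns null once the terms are exhausted, and the usual idiom is to assign and test inside the loop condition. Below is a minimal sketch of that idiom only. It does not use Lucene: `TermCursor`, `cursorOver`, and `collectTerms` are hypothetical stand-ins invented for illustration, so the sketch runs without Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TermLoopSketch {

    /** Hypothetical stand-in for a null-terminated enumeration such as TermsEnum. */
    interface TermCursor {
        String next(); // returns null once there are no more terms
    }

    /** Hypothetical helper: wraps a list in a null-terminated cursor. */
    static TermCursor cursorOver(List<String> terms) {
        Iterator<String> it = terms.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Drains the cursor, advancing it on every pass through the loop. */
    static List<String> collectTerms(TermCursor cursor) {
        List<String> out = new ArrayList<>();
        String text;
        // Assign and test in the condition so each iteration fetches a new term.
        while ((text = cursor.next()) != null) {
            out.add(text);
        }
        return out;
    }

    public static void main(String[] args) {
        TermCursor cursor = cursorOver(Arrays.asList("apache", "index", "lucene"));
        for (String term : collectTerms(cursor)) {
            System.out.println(term);
        }
    }
}
```

Applied to the code in the question, the assumption is that the same shape would work with the real types, i.e. declaring `BytesRef text;` and looping on `(text = termsEnum.next()) != null`. I have not verified whether that alone explains the empty output.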