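I am using Lucene 3.6, and at search time I want to get the score of every term in a particular field of a document. To build the index, I create documents like this: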
Document doc = new Document();
doc.add(new Field("description", entry.getDescription(), Field.Store.NO, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
writer.addDocument(doc);
writer.close(true);
For example, a document contains the term "football", and the explain output for it looks like this:
...
1.623904 = (MATCH) fieldWeight(description:football in 1775), product of:
1.0 = tf(termFreq(description:football )=1)
8.660821 = idf(docFreq=5, maxDocs=12741)
0.1875 = fieldNorm(field=description, doc=1775)
...
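To make sure I am reading this output correctly, here is a minimal sketch in plain Java (no Lucene dependency) that recomputes the three factors. It assumes the DefaultSimilarity formulas tf = sqrt(termFreq) and idf = ln(maxDocs / (docFreq + 1)) + 1. The class and method names are hypothetical, and the fieldNorm value 0.1875 is simply copied from the output above:

```java
// Sketch only: re-derives the explain() numbers, assuming DefaultSimilarity formulas.
// FieldWeightSketch and its methods are hypothetical names, not part of Lucene.
class FieldWeightSketch {

    // tf under the assumed DefaultSimilarity formula: sqrt(termFreq)
    static float tf(int termFreq) {
        return (float) Math.sqrt(termFreq);
    }

    // idf under the assumed DefaultSimilarity formula: ln(maxDocs / (docFreq + 1)) + 1
    static float idf(int docFreq, int maxDocs) {
        return (float) (Math.log(maxDocs / (double) (docFreq + 1)) + 1.0);
    }

    // fieldWeight is the product of the three factors listed in the explain() output
    static float fieldWeight(int termFreq, int docFreq, int maxDocs, float fieldNorm) {
        return tf(termFreq) * idf(docFreq, maxDocs) * fieldNorm;
    }

    public static void main(String[] args) {
        // Inputs copied from the explain() output; fieldNorm is the factor
        // that I do not yet know how to obtain myself.
        float weight = fieldWeight(1, 5, 12741, 0.1875f);
        System.out.println("fieldWeight = " + weight);
    }
}
```

With these inputs the product agrees with the 1.623904 reported above to within rounding, so tf and idf are accounted for and only fieldNorm is missing.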
I am using this code to get tf and idf:
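TermFreqVector freqV = indexReader.getTermFreqVector(docId, "description");
String[] terms = freqV.getTerms();
int[] freqs = freqV.getTermFrequencies();
for (int j = 0; j < terms.length; j++) {
    String term = terms[j];
    // raw term frequency of this term in the document's field
    int freq = freqs[j];
    // the field name must match the indexed field "description"
    float idf = similarity.idfExplain(new Term("description", term), searcher).getIdf();
}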
But I cannot figure out how to get fieldNorm at search time.
Can anyone help with this?
Thanks.