Question

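I currently have an index containing 2 documents (more will be added once everything works). I am trying to calculate the document frequency (df) of a specific term, but I always get the total number of documents in the index as the result. For debugging purposes I put a unique string into one of the documents, so the df for that string should be 1. However, it returns 2. At the end of the process, I need a tf/idf score for every word in the index.

I tried the following code: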

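import java.io.File;
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public void calcDF(String term) throws IOException
{
    // open the index directory
    Directory dir = FSDirectory.open(new File("d:/index"));
    // create a reader
    IndexReader ir = IndexReader.open(dir);
    // for debugging: maxDoc() reports the number of documents in the index
    System.out.println("num of docs in index : " + ir.maxDoc());
    // look up the document frequency of the term in the "content" field
    Term t = new Term("content", term);
    int df = ir.docFreq(t);
    System.out.println("df of \"" + term + "\" : " + df);
    // release the reader once done
    ir.close();
}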

I also tried using IndexSearcher searcher = new IndexSearcher(ir); instead of the IndexReader, but had no luck.

P.S.: I am using Lucene 3.5.
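
To make the expected result concrete, here is a minimal plain-Java sketch of what I understand df and tf/idf to mean, independent of Lucene. The class DfSketch, its method names, and the sample tokens are all hypothetical and only for illustration. The idf used is the textbook ln(N / df), which I am assuming here and which is not necessarily the formula Lucene applies when scoring.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: works on tokenized documents held in memory.
final class DfSketch {

    // Document frequency: number of documents that contain the term at least once.
    static int docFreq(List<List<String>> docs, String term) {
        int df = 0;
        for (List<String> doc : docs) {
            if (doc.contains(term)) {
                df++;
            }
        }
        return df;
    }

    // Term frequency: raw number of occurrences of the term in one document.
    static int termFreq(List<String> doc, String term) {
        int tf = 0;
        for (String token : doc) {
            if (token.equals(term)) {
                tf++;
            }
        }
        return tf;
    }

    // Textbook tf-idf (an assumption): tf * ln(N / df); 0 if the term occurs nowhere.
    static double tfIdf(List<List<String>> docs, int docIndex, String term) {
        int df = docFreq(docs, term);
        if (df == 0) {
            return 0.0;
        }
        return termFreq(docs.get(docIndex), term) * Math.log((double) docs.size() / df);
    }

    public static void main(String[] args) {
        // Two made-up documents; "uniquetoken" appears in the first one only.
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("lucene", "index", "uniquetoken"),
            Arrays.asList("lucene", "index", "search"));
        System.out.println(docFreq(docs, "uniquetoken")); // prints 1
        System.out.println(docFreq(docs, "lucene"));      // prints 2
        System.out.println(tfIdf(docs, 0, "uniquetoken"));
    }
}
```

With two documents and a string that occurs in only one of them, this is the behaviour I expect from docFreq: 1 for the unique string, and 2 only for terms present in both documents.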

Calculating DF with Lucene does not work

0 answers: