Counting how many times a file is found across multiple queries in Lucene (Java)

Time: 2015-12-31 17:06:16

Tags: java lucene

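I am using Lucene from Java code. I searched the Index directory for a phrase, for example "software engineering, software development", using ShingleFilter(TokenStream input, int minShingleSize, int maxShingleSize). It works well. The output is: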

Phrase Searching:software engineering software
Found 5 hits.
1. Index Document ID:336 File Name: jucs_243.pdf.txt
2. Index Document ID:506 File Name: jucs_4.pdf.txt
3. Index Document ID:524 File Name: jucs_419.pdf.txt
4. Index Document ID:276 File Name: jucs_189.pdf.txt
5. Index Document ID:340 File Name: jucs_247.pdf.txt

Phrase Searching:software engineering software development
Found 1 hits.
1. Index Document ID:506 File Name: jucs_4.pdf.txt
Phrase Searching:engineering software development
Found 1 hits.
1. Index Document ID:506 File Name: jucs_4.pdf.txt

My question is: how can I count, in Java, how many times a single file appears across these searches? My code is:

// display search results
TopDocs topDocs = searcher.search(query, LuceneConstants.MAX_SEARCH);
ScoreDoc[] hits = topDocs.scoreDocs;

System.out.println("Found " + hits.length + " hits.");

for (int i = 0; i < hits.length; ++i) {
    int docId = hits[i].doc;
    // print some info about where the hit was found...
    Document d = searcher.doc(docId);
    System.out.println((i + 1) + ". " + "Index Document ID:" + docId + " File Name: " + d.get(LuceneConstants.FILE_PATH));
}

1 Answer:

Answer 0 (score: 0)

I solved this with a TreeMap that maps each document ID to the number of times it was hit:

    static TreeMap<Integer, Integer> Total_Hits = new TreeMap<Integer, Integer>();

My code is:

     for (int i = 0; i < hits.length; ++i) {
         int docId = hits[i].doc;
         if (Total_Hits.containsKey(docId)) {
             // seen in an earlier search: increment its count
             Total_Hits.put(docId, Total_Hits.get(docId) + 1);
         } else {
             // first time this document is hit
             Total_Hits.put(docId, 1);
         }
     }

Output:

Document ID:276  No oF Hits : 1  Time
Document ID:336  No oF Hits : 1  Time
Document ID:340  No oF Hits : 1  Time
Document ID:506  No oF Hits : 3  Time
Document ID:524  No oF Hits : 1  Time
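
For readers who want to try the tallying step without a Lucene index, here is a minimal, self-contained sketch. It assumes the document IDs of each search have already been collected into arrays; the class name `HitTally` and the method `countHits` are hypothetical and not part of the code above or of Lucene. The sample IDs are the ones from the three searches in the question. Since a document appears at most once in each search's results, the count is the number of searches that matched the document, not the number of phrase occurrences inside it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only (not part of Lucene).
class HitTally {

    // For each document ID, counts how many result lists contain it.
    // A TreeMap keeps the IDs sorted in ascending order.
    static Map<Integer, Integer> countHits(List<int[]> resultsPerQuery) {
        Map<Integer, Integer> totals = new TreeMap<>();
        for (int[] docIds : resultsPerQuery) {
            for (int docId : docIds) {
                // Insert 1 on first sight, otherwise add 1 to the stored count.
                totals.merge(docId, 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Document IDs taken from the three searches shown in the question.
        List<int[]> results = Arrays.asList(
            new int[] {336, 506, 524, 276, 340},
            new int[] {506},
            new int[] {506});

        countHits(results).forEach((docId, count) ->
            System.out.println("Document ID:" + docId + "  Hits: " + count));
    }
}
```

`Map.merge` (Java 8 and later) replaces the explicit `containsKey` branch; on older Java versions the `containsKey`/`put` form above does the same thing.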