When searching a large set of documents, I can easily find the number of documents that match my search criteria:
Hits hits = Searcher.Search(query);
int DocumentCount = hits.Length();
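How do I determine the total number of hits within the documents? For example, say I search for "congress" and get 2 documents back. How can I get the number of times "congress" occurs in each document? For example, say "congress" occurs 2 times in document #1 and 3 times in document #2. The result I'm looking for is 5.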
Answer 0: (score: 6)
This is Lucene Java, but it should work for Lucene.NET as well:
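List<Integer> docIds = // doc ids for documents that matched the query,
                       // sorted in ascending order
int totalFreq = 0;
TermDocs termDocs = reader.termDocs();
termDocs.seek(new Term("my_field", "congress"));
for (int id : docIds) {
    // skipTo moves to the first entry whose doc number is >= id and returns
    // false when the postings are exhausted, so only count an exact match.
    // This assumes every matched document contains the term, as it does
    // when the query is a TermQuery for this same term.
    if (termDocs.skipTo(id) && termDocs.doc() == id) {
        totalFreq += termDocs.freq();
    }
}
termDocs.close();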
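To make the idea behind that loop concrete without needing an index, here is a minimal plain-Java sketch. It is only an illustration under assumptions: `TermFreqSketch`, `totalFreq`, and the `{docId, freq}` array layout are hypothetical stand-ins for a term's postings and are not part of Lucene. It walks a sorted postings list once and sums the frequencies for the matched doc ids.

```java
import java.util.Arrays;
import java.util.List;

// Toy stand-in for a term's postings list; not a Lucene class.
final class TermFreqSketch {

    // postings[i] = {docId, freq}, sorted by docId ascending.
    // matchedDocIds must be sorted ascending as well.
    static int totalFreq(int[][] postings, List<Integer> matchedDocIds) {
        int total = 0;
        int p = 0; // cursor into postings; it only ever moves forward
        for (int id : matchedDocIds) {
            // Advance to the first posting whose docId is >= id.
            while (p < postings.length && postings[p][0] < id) {
                p++;
            }
            if (p == postings.length) {
                break; // postings exhausted, nothing more to add
            }
            if (postings[p][0] == id) {
                total += postings[p][1];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // The term occurs 2 times in doc 1, 7 times in doc 4, 3 times in doc 9.
        int[][] postings = { {1, 2}, {4, 7}, {9, 3} };
        // Only docs 1 and 9 matched the query, so the sum is 2 + 3.
        System.out.println(totalFreq(postings, Arrays.asList(1, 9))); // prints 5
    }
}
```

Because both lists are sorted, the cursor never moves backwards, which is the same reason the doc ids have to be in ascending order for the `skipTo` loop.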
Answer 1: (score: 0)
This is also Lucene Java. If your query/search criteria can be written as a SpanQuery, then you can do something like this:
IndexReader indexReader = // define your index reader here
SpanQuery spanQuery = // define your span query here
Spans spans = spanQuery.getSpans(indexReader);
int occurrenceCount = 0;
while (spans.next()) {
    occurrenceCount++;
}
// now occurrenceCount contains the total number of occurrences of the word/phrase/etc across all documents in the index
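As a rough illustration of what that counter ends up holding, here is a plain-Java sketch that does not use Lucene at all. The names `SpanCountSketch` and `countOccurrences` are hypothetical, and documents are assumed to be pre-tokenized string arrays. Each position where the word or phrase starts counts as one occurrence, across every document.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of counting one "span" per occurrence; not a Lucene class.
final class SpanCountSketch {

    // Counts every start position, in every document, at which the given
    // token sequence (a single word or a phrase) appears.
    static int countOccurrences(List<String[]> docs, String... phrase) {
        if (phrase.length == 0) {
            return 0; // an empty phrase matches nothing in this sketch
        }
        int count = 0;
        for (String[] tokens : docs) {
            for (int start = 0; start + phrase.length <= tokens.length; start++) {
                String[] window = Arrays.copyOfRange(tokens, start, start + phrase.length);
                if (Arrays.equals(window, phrase)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String[]> docs = Arrays.<String[]>asList(
            new String[] {"congress", "met", "congress"},
            new String[] {"the", "congress", "of", "congress", "and", "congress"},
            new String[] {"no", "match", "here"});
        System.out.println(countOccurrences(docs, "congress")); // prints 5
    }
}
```

Note that this counts over all documents rather than only those returned by a separate search, which matches the comment in the snippet above about the total being across the whole index.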