Applying a Lucene Query to Bits

Time: 2016-12-03 15:19:16

Tags: solr lucene

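How can I apply a given org.apache.lucene.search.Query to an org.apache.lucene.util.Bits object?

Background: I have a subclass of org.apache.lucene.index.FilterLeafReader, and I want to filter its liveDocs by applying a query to the Bits.

According to the javadoc, if I override numDocs(), I also need to override getLiveDocs(). So the question extends to how to filter the document count by a query (inside a FilterLeafReader).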

1 Answer:

Answer 0: (score: 1)

I ended up with this solution (after asking the question on the Lucene mailing list):

final IndexSearcher searcher = new IndexSearcher(reader);
searcher.setQueryCache(null);
final boolean needsScores = false; // scores are not needed, only matching docs
final Weight preserveWeight = searcher.createNormalizedWeight(preserveFilter, needsScores);
final int maxDoc = in.maxDoc();
final FixedBitSet bits = new FixedBitSet(maxDoc);
// ignore livedocs here, as we filter them later:
final Scorer preserveScorer = preserveWeight.scorer(context);
if (preserveScorer != null) {
  bits.or(preserveScorer.iterator());
}
if (negateFilter) {
  bits.flip(0, maxDoc);
}

if (in.hasDeletions()) {
  final Bits oldLiveDocs = in.getLiveDocs();
  assert oldLiveDocs != null;
  final DocIdSetIterator it = new BitSetIterator(bits, 0L); // the cost is not useful here
  for (int i = it.nextDoc(); i != DocIdSetIterator.NO_MORE_DOCS; i = it.nextDoc()) {
    if (!oldLiveDocs.get(i)) {
      // we can safely modify the current bit, as the iterator already stepped over it:
      bits.clear(i);
    }
  }
}

this.liveDocs = bits;
this.numDocs = bits.cardinality();

https://github.com/apache/lucene-solr/blob/master/lucene/misc/src/java/org/apache/lucene/index/PKIndexSplitter.java#L127-L170
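
The bit manipulation in the answer can be tried out without a Lucene index. The following is a minimal sketch, not the PKIndexSplitter code itself: it assumes java.util.BitSet as a stand-in for both FixedBitSet and the old liveDocs Bits, hard-codes the set of matching documents instead of reading it from a Scorer, and uses a single and() call in place of the explicit iterator loop. The names LiveDocsSketch and filterLiveDocs are hypothetical.

```java
import java.util.BitSet;

final class LiveDocsSketch {

    /**
     * Computes the new live docs for a filtered reader (stand-in types only).
     *
     * @param matching    docs matched by the query (what bits.or(scorer.iterator()) would fill)
     * @param maxDoc      number of doc IDs in the segment, deleted ones included
     * @param negate      if true, keep the docs that do NOT match the query
     * @param oldLiveDocs live docs of the wrapped reader, or null if it has no deletions
     */
    static BitSet filterLiveDocs(BitSet matching, int maxDoc, boolean negate, BitSet oldLiveDocs) {
        BitSet bits = (BitSet) matching.clone(); // leave the caller's set untouched
        if (negate) {
            bits.flip(0, maxDoc);
        }
        if (oldLiveDocs != null) {
            // A doc stays live only if it passes the filter AND was live before.
            bits.and(oldLiveDocs);
        }
        return bits;
    }

    public static void main(String[] args) {
        int maxDoc = 8;

        BitSet matching = new BitSet(maxDoc);
        matching.set(1);
        matching.set(3);
        matching.set(4);

        BitSet oldLiveDocs = new BitSet(maxDoc);
        oldLiveDocs.set(0, maxDoc);
        oldLiveDocs.clear(3); // doc 3 was already deleted in the wrapped reader

        BitSet liveDocs = filterLiveDocs(matching, maxDoc, false, oldLiveDocs);
        int numDocs = liveDocs.cardinality(); // must stay in sync with liveDocs

        System.out.println(liveDocs + " numDocs=" + numDocs); // prints "{1, 4} numDocs=2"
    }
}
```

Deriving numDocs from the cardinality of the same bit set that getLiveDocs() returns is what keeps the two overrides consistent with each other.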