Lucene CustomScoreQuery does not pass the value from the FunctionQuery's FieldSource

Asked: 2015-02-01 13:48:06

Tags: java lucene

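If I understand the Lucene Javadoc page correctly, setting a `CustomScoreQuery` instance to strict should pass the value of the `FunctionQuery`'s `FloatFieldSource`, unmodified (e.g. not normalized), to the `CustomScoreProvider` as the `valSrcScore` argument of the method `public float customScore(int doc, float subQueryScore, float valSrcScore)`. So I assumed that I would get exactly the float value stored in the document's `FloatField`.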

But this does not seem to be the case once the index grows large. Here is a minimal example that shows what I mean:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.queries.*;
import org.apache.lucene.queries.function.FunctionQuery;
import org.apache.lucene.queries.function.valuesource.FloatFieldSource;
import org.apache.lucene.search.*;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

import java.io.IOException;

public class CustomScoreTest {

    public static void main(String[] args) throws IOException {
        RAMDirectory index = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(Version.LATEST, new StandardAnalyzer());
        IndexWriter writer = new IndexWriter(index, config);

        // prepare dummy text
        String text = "";
        for (int i = 0; i < 1000; i++) text += "abc ";

        // add dummy docs
        for (int i = 0; i < 25000; i++) {
            Document doc = new Document();
            doc.add(new FloatField("number", i * 100f, Field.Store.YES));
            doc.add(new TextField("text", text, Field.Store.YES));
            writer.addDocument(doc);
        }
        writer.close();

        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);

        Query q1 = new TermQuery(new Term("text", "abc"));
        CustomScoreQuery q2 = new CustomScoreQuery(q1, new FunctionQuery(new FloatFieldSource("number"))) {
            protected CustomScoreProvider getCustomScoreProvider(AtomicReaderContext ctx) throws IOException {
                return new CustomScoreProvider(ctx) {
                    public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
                        float diff = Math.abs(valSrcScore - searcher.doc(doc).getField("number").numericValue().floatValue());
                        if (diff > 0) throw new IllegalStateException("diff: " + diff);
                        return super.customScore(doc, subQueryScore, valSrcScore);
                    }
                };
            }
        };

        // In strict custom scoring, the ValueSource part does not participate in weight normalization.
        // This may be useful when one wants full control over how scores are modified, and
        // does not care about normalising by the ValueSource part
        q2.setStrict(true);

        // Exception in thread "main" java.lang.IllegalStateException: diff: 1490700.0
        searcher.search(q2, 10);
    }
}
```

As the example shows, the exception is thrown because `valSrcScore` differs greatly from the actual value stored in the document's "number" field.

But when I reduce the number of indexed dummy documents to 2500, it works as expected and the difference from the value in the "number" field is 0.

What am I doing wrong here?

1 answer:

Answer 0 (score: 0)

Which version of Lucene are you running? One possibility is that, as your index grows, AtomicReaderContext should be replaced with LeafReaderContext. Just a hypothesis.
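Separately from the version question, it may be worth checking the scope of the doc ID. The Javadoc for CustomScoreProvider notes that the doc ID handed to customScore is per-segment, whereas IndexSearcher.doc(int) expects an index-wide ID. A small index often ends up in a single segment, where the two coincide, which would fit the observation that 2500 documents work and 25000 do not. The sketch below is plain Java, not Lucene code: `Segment`, `buildIndex` and `numberByGlobalId` are hypothetical stand-ins, and the segment size of 14907 is an assumption picked only because it reproduces the reported diff of 1490700.0.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a multi-segment index; illustrative only, not Lucene code. */
public class SegmentDocIdSketch {

    /** Hypothetical stand-in for one index segment. */
    static final class Segment {
        final int docBase;      // index-wide ID of this segment's first document
        final float[] numbers;  // stored "number" value, indexed by segment-local doc ID

        Segment(int docBase, float[] numbers) {
            this.docBase = docBase;
            this.numbers = numbers;
        }
    }

    /** Splits totalDocs documents (number = globalId * 100f) into segments of at most segmentSize docs. */
    static List<Segment> buildIndex(int totalDocs, int segmentSize) {
        List<Segment> segments = new ArrayList<>();
        for (int base = 0; base < totalDocs; base += segmentSize) {
            int size = Math.min(segmentSize, totalDocs - base);
            float[] numbers = new float[size];
            for (int local = 0; local < size; local++) {
                numbers[local] = (base + local) * 100f;
            }
            segments.add(new Segment(base, numbers));
        }
        return segments;
    }

    /** Looks a document up by index-wide ID, the way a top-level searcher would. */
    static float numberByGlobalId(List<Segment> index, int globalId) {
        for (Segment s : index) {
            int local = globalId - s.docBase;
            if (local >= 0 && local < s.numbers.length) {
                return s.numbers[local];
            }
        }
        throw new IllegalArgumentException("no such doc: " + globalId);
    }

    public static void main(String[] args) {
        // Assumed layout: 25000 docs, first segment holds 14907 of them.
        List<Segment> index = buildIndex(25000, 14907);
        Segment second = index.get(1);

        int localDoc = 0;                         // segment-local ID, as a per-segment scorer sees it
        float valSrc = second.numbers[localDoc];  // value read within the segment

        // Mistake being illustrated: treating the local ID as an index-wide ID.
        float wrong = numberByGlobalId(index, localDoc);
        // Rebasing the local ID with the segment's docBase first.
        float right = numberByGlobalId(index, localDoc + second.docBase);

        System.out.println("diff using local ID as global: " + Math.abs(valSrc - wrong));
        System.out.println("diff after adding docBase:     " + Math.abs(valSrc - right));
    }
}
```

If this is what is happening in the question's code, then on Lucene 4.x reading the stored field inside the provider via `context.reader().document(doc)`, or calling `searcher.doc(doc + context.docBase)`, should compare like with like. That is a suggestion to try rather than something verified against the asker's setup.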