Question

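If I understand the Lucene Javadoc correctly, setting a CustomScoreQuery instance to strict should pass the float value of a FunctionQuery's FieldSource, without modification (e.g. normalization), to the valSrcScore parameter of the CustomScoreProvider method public float customScore(int doc, float subQueryScore, float valSrcScore). So I assumed I would get exactly the float value that is stored in the document's FloatField.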

But this does not seem to be the case once the index grows larger. Here is a minimal example that shows what I mean:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.*;
    import org.apache.lucene.index.*;
    import org.apache.lucene.queries.*;
    import org.apache.lucene.queries.function.FunctionQuery;
    import org.apache.lucene.queries.function.valuesource.FloatFieldSource;
    import org.apache.lucene.search.*;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.Version;

    import java.io.IOException;

    public class CustomScoreTest {

        public static void main(String[] args) throws IOException {
            RAMDirectory index = new RAMDirectory();
            IndexWriterConfig config = new IndexWriterConfig(Version.LATEST, new StandardAnalyzer());
            IndexWriter writer = new IndexWriter(index, config);

            // prepare dummy text
            String text = "";
            for (int i = 0; i < 1000; i++) text += "abc ";

            // add dummy docs
            for (int i = 0; i < 25000; i++) {
                Document doc = new Document();
                doc.add(new FloatField("number", i * 100f, Field.Store.YES));
                doc.add(new TextField("text", text, Field.Store.YES));
                writer.addDocument(doc);
            }
            writer.close();

            IndexReader reader = IndexReader.open(index);
            IndexSearcher searcher = new IndexSearcher(reader);

            Query q1 = new TermQuery(new Term("text", "abc"));
            CustomScoreQuery q2 = new CustomScoreQuery(q1, new FunctionQuery(new FloatFieldSource("number"))) {
                protected CustomScoreProvider getCustomScoreProvider(AtomicReaderContext ctx) throws IOException {
                    return new CustomScoreProvider(ctx) {
                        public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
                            float diff = Math.abs(valSrcScore - searcher.doc(doc).getField("number").numericValue().floatValue());
                            if (diff > 0) throw new IllegalStateException("diff: " + diff);
                            return super.customScore(doc, subQueryScore, valSrcScore);
                        }
                    };
                }
            };

            // In strict custom scoring, the ValueSource part does not participate in weight normalization.
            // This may be useful when one wants full control over how scores are modified, and
            // does not care about normalising by the ValueSource part
            q2.setStrict(true);

            // Exception in thread "main" java.lang.IllegalStateException: diff: 1490700.0
            searcher.search(q2, 10);
        }
    }

As this example shows, the exception is thrown because valSrcScore differs considerably from the actual value stored in the document's "number" field.

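But when I reduce the number of indexed dummy documents to 2500, it works as expected: the difference between the value I get and the value in the "number" field is 0.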

What am I doing wrong here?

Answer 1

Which version of Lucene are you running? One possibility is that, as your index grows, AtomicReaderContext should be replaced with LeafReaderContext. Just a hypothesis.
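One further thing that may be worth checking, independent of the Lucene version: Lucene scores each index segment separately, and the doc argument handed to a CustomScoreProvider is, as far as the Javadoc describes it, relative to that segment, whereas searcher.doc(...) expects an index-wide ID. The sketch below is plain Java with no Lucene dependency; the Segment class and the segment size of 14907 are hypothetical, the latter chosen only because it matches the reported diff of 1490700.0 divided by 100. It shows how mixing the two ID spaces yields a zero difference with a single segment and a large one as soon as a second segment exists.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentDocIdSketch {

    /** Hypothetical stand-in for one index segment (not a Lucene class). */
    static final class Segment {
        final int docBase;     // index-wide ID of the segment's first document
        final float[] values;  // "number" values, indexed by segment-local doc ID

        Segment(int docBase, float[] values) {
            this.docBase = docBase;
            this.values = values;
        }
    }

    /** Splits numDocs documents with number = i * 100f into segments of at most segmentSize docs. */
    static List<Segment> buildSegments(int numDocs, int segmentSize) {
        List<Segment> segments = new ArrayList<>();
        for (int base = 0; base < numDocs; base += segmentSize) {
            int size = Math.min(segmentSize, numDocs - base);
            float[] values = new float[size];
            for (int local = 0; local < size; local++) {
                values[local] = (base + local) * 100f;
            }
            segments.add(new Segment(base, values));
        }
        return segments;
    }

    /** Index-wide lookup by global doc ID, playing the role of searcher.doc(id). */
    static float storedValue(List<Segment> segments, int globalDoc) {
        for (Segment s : segments) {
            int local = globalDoc - s.docBase;
            if (local >= 0 && local < s.values.length) {
                return s.values[local];
            }
        }
        throw new IllegalArgumentException("no such doc: " + globalDoc);
    }

    /** Largest gap between the per-segment value and the index-wide lookup. */
    static float maxDiff(List<Segment> segments, boolean addDocBase) {
        float max = 0f;
        for (Segment s : segments) {
            for (int local = 0; local < s.values.length; local++) {
                float valSrcScore = s.values[local];
                // Without docBase, the local ID is misread as an index-wide ID.
                int lookupId = addDocBase ? s.docBase + local : local;
                max = Math.max(max, Math.abs(valSrcScore - storedValue(segments, lookupId)));
            }
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(maxDiff(buildSegments(2500, 14907), false));  // single segment
        System.out.println(maxDiff(buildSegments(25000, 14907), false)); // two segments, local IDs
        System.out.println(maxDiff(buildSegments(25000, 14907), true));  // two segments, docBase added
    }
}
```

If this is what happens in the question's code, the analogous change there would presumably be to look the document up with searcher.doc(ctx.docBase + doc) or ctx.reader().document(doc) instead of searcher.doc(doc). That suggestion is untested against the asker's setup.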

Lucene CustomScoreQuery does not pass the value of the FieldSource from a FunctionQuery
