Lucene 5.4 - score divided by the number of search terms?

Date: 2016-03-13 11:14:36

Tags: solr lucene text-mining scoring multi-term

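I have a simple setup using IndexSearcher, QueryParser, and SimpleAnalyzer. Running some queries, I noticed that for queries with multiple terms the value returned in ScoreDoc[i].score differs from the score shown in the explain output for the same query. Apparently it is the score shown in the explanation divided by the number of search terms. Is there any explanation for this behavior?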

Running search(TERM1 TERM2 TERM3)
line:term1 line:term2 line:term3
2.167882 = sum of:
  0.6812867 = weight(line:term1 in 6594) [DefaultSimilarity], result of:
    0.6812867 = score(doc=6594,freq=2.0), product of:
      0.5389907 = queryWeigh

totalHits 1
1678413725, TERM1 TERM2 TERM3, score: 0.72262734
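
As a quick check of the "divided by the number of terms" observation, the two numbers printed above can be compared directly. This is only an arithmetic sketch; the class and method names are hypothetical and nothing here calls Lucene.

```java
import java.util.Locale;

public class ScoreRatioCheck {
    // Ratio between the score at the top of the explanation and the score reported for the hit.
    static double ratio(double explainedScore, double hitScore) {
        return explainedScore / hitScore;
    }

    public static void main(String[] args) {
        double explained = 2.167882;  // top line of the explanation above
        double hit = 0.72262734;      // score printed for the hit above
        // Locale.ROOT keeps the decimal separator stable across machines.
        System.out.println(String.format(Locale.ROOT, "ratio = %.4f", ratio(explained, hit)));
    }
}
```

The ratio comes out at 3 to within float rounding, which matches the three terms in the query.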

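I understand that the coord() factor is used to penalize documents that contain only a subset of the supplied search terms. However, this document contains all of the terms. Any suggestions?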
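
For reference, the coord factor in DefaultSimilarity is the fraction of query clauses that matched. The sketch below reimplements that ratio in plain Java to show what one would expect for a document matching all three terms; the class name is hypothetical and it does not call Lucene itself.

```java
public class CoordSketch {
    // Same ratio as the coord factor: matched clauses divided by total clauses.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        System.out.println(coord(3, 3)); // all three terms matched: 1.0, no penalty
        System.out.println(coord(1, 3)); // one of three terms matched: 0.33333334
    }
}
```

With all terms present the factor should be 1.0, so coord alone would not account for a division by three in that case.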

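Edit: It seems the division only happens when the query is configured to use OR rather than AND. So an OR query that matches all of the terms is still divided by the number of terms in the search query. I could not find this in the documentation, but at least it explains the difference.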

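However, applying a QueryWrapperFilter seems to change the score again, although according to the documentation it should only filter the results without affecting scoring.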

More details

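These two scores are results of the same query. Only the second result is divided: its reported score is lower than the value at the top of its explanation.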

0.114700586 = product of:
  0.34410176 = sum of:
    0.34410176 = weight(line:term1 in 24) [DefaultSimilarity], result of:
      0.34410176 = score(doc=24,freq=1.0), product of:
        0.5389907 = queryWeight, product of:
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.065957725 = queryNorm
        0.63841873 = fieldWeight in 24, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.078125 = fieldNorm(doc=24)
  0.33333334 = coord(1/3)

item_id: 1495958818, item_name: term 1 dolor sit met, score: 0.114700586


0.18352094 = product of:
  0.5505628 = sum of:
    0.5505628 = weight(line:term 1 in 6112) [DefaultSimilarity], result of:
      0.5505628 = score(doc=6112,freq=1.0), product of:
        0.5389907 = queryWeight, product of:
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.065957725 = queryNorm
        1.02147 = fieldWeight in 6112, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.125 = fieldNorm(doc=6112)
  0.33333334 = coord(1/3)

item_id: 1677761523, item_name: some text term 1, score: 0.061173648
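
The two explanations above can be recomputed by hand from their leaf values. The sketch below multiplies the factors in the same order as the explain tree (queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, then coord); the class and method names are hypothetical and it is only arithmetic, not a call into Lucene.

```java
public class ExplainRecompute {
    // Score of a single matching term clause, multiplied by coord(matched/total).
    static double explained(double idf, double queryNorm, double tf,
                            double fieldNorm, int matched, int total) {
        double queryWeight = idf * queryNorm;
        double fieldWeight = tf * idf * fieldNorm;
        double coord = (double) matched / total;
        return queryWeight * fieldWeight * coord;
    }

    public static void main(String[] args) {
        double idf = 8.17176, queryNorm = 0.065957725, tf = 1.0;
        double doc24 = explained(idf, queryNorm, tf, 0.078125, 1, 3);
        double doc6112 = explained(idf, queryNorm, tf, 0.125, 1, 3);
        System.out.println("doc 24 explained:   " + doc24);
        System.out.println("doc 6112 explained: " + doc6112);
        // The hit score reported for doc 6112 is a third of its explained value.
        System.out.println("doc 6112 / 3:       " + doc6112 / 3);
    }
}
```

For doc 24 the recomputed value agrees with both the explanation and the reported score (about 0.1147). For doc 6112 it agrees with the explanation (about 0.1835), while the reported score of 0.061173648 equals that value divided by three once more. The sketch only confirms the arithmetic; it does not show where the extra division comes from.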

0 answers:

No answers