Solr: what is the correct way to use an "additive" boost instead of a "max" boost?

Asked: 2018-11-17 02:19:50

Tags: solr solrnet

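Using the debug query feature and looking at the "explain" section, I realized that the boosting I have been using (https://stackoverflow.com/a/7701758/7096114) applies a "max of" comparison across the per-field results of the query. In my system I have 10 fields, each boosted by a certain value. I then sort the results by score, highest to lowest, but I had assumed the score would be the total of the scores from every field that matched. I did not realize the score is set to the highest score calculated for any single boosted field. If I want to prioritize results that match all 10 of my fields, with a total score of, say, 500, above a result that matches only 1 of my fields with a score of, say, 100, I am not sure how to go about it.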

Example explain output:

320.3237 = sum of:
  0.0069028055 = weight(custom_app:test in 7918) [SchemaSimilarity], result of:
    0.0069028055 = score(doc=7918,freq=1.0 = termFreq=1.0
), product of:
      0.006641347 = idf(docFreq=48698, docCount=49022)
      1.0393683 = tfNorm, computed from:
        1.0 = termFreq=1.0
        1.2 = parameter k1
        0.75 = parameter b
        1.1020359 = avgFieldLength
        1.0 = fieldLength
  320.3168 = max of:
    73.23891 = weight(name_autocomplete:james in 7918) [SchemaSimilarity], result of:
      73.23891 = score(doc=7918,freq=1.0 = termFreq=1.0
), product of:
        6.066 = boost
        7.8911004 = idf(docFreq=32, docCount=86884)
        1.5300368 = tfNorm, computed from:
          1.0 = termFreq=1.0
          1.2 = parameter k1
          0.75 = parameter b
          6.527704 = avgFieldLength
          1.0 = fieldLength
    51.871056 = weight(name_partial_match:colin in 7918) [SchemaSimilarity], result of:
      51.871056 = score(doc=7918,freq=1.0 = termFreq=1.0
), product of:
        4.05 = boost
        7.8603234 = idf(docFreq=33, docCount=86843)
        1.6294072 = tfNorm, computed from:
          1.0 = termFreq=1.0
          1.2 = parameter k1
          0.75 = parameter b
          17.933905 = avgFieldLength
          1.0 = fieldLength
    9.736896 = weight(custom_name_phonetic_en:KLN in 7918) [SchemaSimilarity], result of:
      9.736896 = score(doc=7918,freq=1.0 = termFreq=1.0
), product of:
        1.6875 = boost
        5.4820786 = idf(docFreq=361, docCount=86884)
        1.0525228 = tfNorm, computed from:
          1.0 = termFreq=1.0
          1.2 = parameter k1
          0.75 = parameter b
          2.9156578 = avgFieldLength
          2.56 = fieldLength
    61.69854 = weight(custom_display_name_partial_match:colin in 7918) [SchemaSimilarity], result of:
      61.69854 = score(doc=7918,freq=1.0 = termFreq=1.0
), product of:
        5.0625 = boost
        7.532877 = idf(docFreq=46, docCount=86883)
        1.61789 = tfNorm, computed from:
          1.0 = termFreq=1.0
          1.2 = parameter k1
          0.75 = parameter b
          38.531185 = avgFieldLength
          2.56 = fieldLength
    86.66015 = weight(custom_name_autocomplete:colin in 7918) [SchemaSimilarity], result of:
      86.66015 = score(doc=7918,freq=1.0 = termFreq=1.0
), product of:
        7.5825 = boost
        7.6228366 = idf(docFreq=42, docCount=86884)
        1.4993064 = tfNorm, computed from:
          1.0 = termFreq=1.0
          1.2 = parameter k1
          0.75 = parameter b
          13.767955 = avgFieldLength
          2.56 = fieldLength
    9.267912 = weight(name_phonetic_en:KLN in 7918) [SchemaSimilarity], result of:
      9.267912 = score(doc=7918,freq=1.0 = termFreq=1.0
), product of:
        1.35 = boost
        6.1070633 = idf(docFreq=193, docCount=86884)
        1.1241279 = tfNorm, computed from:
          1.0 = termFreq=1.0
          1.2 = parameter k1
          0.75 = parameter b
          1.3697113 = avgFieldLength
          1.0 = fieldLength
    320.3168 = weight(name_lowercase:colin in 7918) [SchemaSimilarity], result of:
      320.3168 = score(doc=7918,freq=1.0 = termFreq=1.0
), product of:
        40.1 = boost
        7.9879503 = idf(docFreq=29, docCount=86884)
        1.0 = tfNorm, computed from:
          1.0 = termFreq=1.0
          1.2 = parameter k1
          0.75 = parameter b
          1.0 = avgFieldLength
          1.0 = fieldLength
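
The per-field numbers in this output can be reproduced by hand, which makes it easier to see where the "max of" step discards information. The following is a minimal sketch, assuming the BM25 formulas that Lucene uses when it prints `idf`, `tfNorm`, `k1` and `b` in this layout; the helper names `bm25_idf` and `bm25_tf_norm` are made up for illustration and are not part of any Solr API. It recomputes the `name_autocomplete:james` entry from the values shown above.

```python
import math


def bm25_idf(doc_freq, doc_count):
    # idf as printed in the explain output: log(1 + (N - n + 0.5) / (n + 0.5))
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, k1, b, field_length, avg_field_length):
    # tfNorm as printed in the explain output
    length_norm = 1 - b + b * field_length / avg_field_length
    return (freq * (k1 + 1)) / (freq + k1 * length_norm)


# Values copied from the name_autocomplete:james entry above.
idf = bm25_idf(doc_freq=32, doc_count=86884)
tf_norm = bm25_tf_norm(freq=1.0, k1=1.2, b=0.75,
                       field_length=1.0, avg_field_length=6.527704)
field_score = 6.066 * idf * tf_norm  # boost * idf * tfNorm

print(round(idf, 4), round(tf_norm, 4), round(field_score, 3))
```

Each field's score is the product boost * idf * tfNorm, and the "max of" node then keeps only the largest of those products (here the `name_lowercase` score of 320.3168).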

1 Answer:

Answer 0 (score: 1)

If you want to include a fraction of the other scores, in addition to the highest-scoring query, you can use the tie parameter.

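This parameter tells Solr how much of the score from the other fields that also produced a hit should count toward the final score. It is usually a low value, such as 0.1. For the fully additive behaviour described in the question, where a document matching every field receives the sum of its field scores, set tie to 1.0.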
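
As a rough illustration of where the parameter goes, here is a minimal sketch that builds a select URL with `tie` set. It assumes the query is parsed by the DisMax or eDisMax parser (the "max of" node in the explain output suggests this, but the question does not say so). The host, the collection name `mycollection`, and the boost values in `qf` are placeholders; only the field names come from the explain output.

```python
from urllib.parse import urlencode

# Hypothetical request parameters; adjust parser, fields and boosts to your setup.
params = {
    "defType": "edismax",
    "q": "colin",
    "qf": "name_lowercase^40 name_autocomplete^6 name_partial_match^4",
    "tie": "1.0",          # 1.0 = sum all matching field scores
    "debugQuery": "true",  # keep the explain section to verify the change
}

# Placeholder host and collection name.
url = "http://localhost:8983/solr/mycollection/select?" + urlencode(params)
print(url)
```

With `debugQuery` still on, the explain section for each document should show whether the per-field scores are now being combined rather than reduced to the single highest one.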

  

The tie parameter specifies a float value (which should be less than 1) to use as a tiebreaker in DisMax queries.

     

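When a term from the user's input is tested against multiple fields, more than one field may match. If so, each field will generate a different score based on how common that word is in that field (for each document relative to all other documents). The tie parameter lets you control how much the final score of the query will be influenced by the scores of the lower-scoring fields compared to the highest-scoring field.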

     

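A value of "0.0" (the default) makes the query a pure "disjunction max query": that is, only the maximum-scoring subquery contributes to the final score. A value of "1.0" makes the query a pure "disjunction sum query", where it does not matter what the maximum-scoring subquery is, because the final score will be the sum of the subquery scores. Typically a low value, such as 0.1, is useful.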
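
To make the effect of tie concrete, here is a small sketch of the combination rule for a disjunction max query: the best subquery score plus tie times the sum of the remaining subquery scores. The function name `dismax_score` is invented for illustration; the field scores are the seven values listed under "max of" in the question's explain output.

```python
def dismax_score(sub_scores, tie=0.0):
    # Best subquery score, plus `tie` times the sum of all the other scores.
    best = max(sub_scores)
    return best + tie * (sum(sub_scores) - best)


# Per-field scores taken from the "max of" section of the explain output.
field_scores = [73.23891, 51.871056, 9.736896, 61.69854,
                86.66015, 9.267912, 320.3168]

for tie in (0.0, 0.1, 1.0):
    print(tie, round(dismax_score(field_scores, tie), 4))
```

With tie at 0.0 the result is just the `name_lowercase` score, which matches what the question observed. With tie at 1.0 every matching field contributes in full, so a document matching many fields outranks one matching a single field with a similar best score.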