Very large "fieldLength" value with Solr BM25

Time: 2018-03-26 08:48:37

Tags: solr lucene solr6

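I have run into a problem with how fieldLength is computed in Solr 6. I am using BM25 as the similarity. When I index a set of documents, the fieldLength values for those documents come out badly wrong. For a title field containing only 9 words, the explain output reports a fieldLength of "5.6493154E19", which cannot be right. When I reindex a single document, its score is corrected and fieldLength is shown as "10.24". When I then reindex the whole corpus, the values are corrupted again and fieldLength is once more "5.6493154E19".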

Explain output with the originally stored fieldLength value:

    4.641637E-19 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
      1.0 = termFreq=1.0
      1.2 = parameter k1
      0.75 = parameter b
      10.727212 = avgFieldLength
      5.6493154E19 = fieldLength

After reindexing the single document:

    1.0189644 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
      1.0 = termFreq=1.0
      1.2 = parameter k1
      0.75 = parameter b
      10.72807 = avgFieldLength
      10.24 = fieldLength
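
As a sanity check, the tfNorm formula printed in the explain output can be evaluated by hand. The sketch below is plain Python, not Solr code; the function name `tf_norm` and its argument names are made up for illustration. It plugs in the numbers from the two explain outputs above to show that the formula itself behaves consistently and that the tiny score follows entirely from the huge fieldLength input.

```python
def tf_norm(freq, k1, b, field_length, avg_field_length):
    """Evaluate the tfNorm expression exactly as printed in the explain output."""
    return (freq * (k1 + 1)) / (
        freq + k1 * (1 - b + b * field_length / avg_field_length)
    )

# Values copied from the explain output with the suspicious fieldLength.
corrupted = tf_norm(1.0, 1.2, 0.75, 5.6493154e19, 10.727212)

# Values copied from the explain output after reindexing one document.
reindexed = tf_norm(1.0, 1.2, 0.75, 10.24, 10.72807)

print("tfNorm with fieldLength=5.6493154E19:", corrupted)
print("tfNorm with fieldLength=10.24:", reindexed)
```

Both results agree with the reported tfNorm values to within single-precision rounding, so the scoring arithmetic appears to be fine and the question reduces to where the fieldLength input comes from.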

After reindexing the whole corpus:

    4.641637E-19 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
      1.0 = termFreq=1.0
      1.2 = parameter k1
      0.75 = parameter b
      10.727212 = avgFieldLength
      5.6493154E19 = fieldLength
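
One possible way to read the two fieldLength values is through the one-byte norm encoding. The sketch below is an assumption, not something the explain output confirms: it supposes that Lucene 6.x's BM25Similarity stores 1/sqrt(length) in a single byte (in the style of `SmallFloat.floatToByte315`) and decodes it back as 1/(f*f), with byte 0 special-cased to the reciprocal of entry 255. The Python functions are an illustrative re-implementation with hypothetical names, not Lucene's actual code.

```python
import math
import struct

EXP_OFFSET = 63 - 15  # assumed zero-exponent offset of the 3-bit-mantissa format

def float_to_byte315(f):
    """Assumed encoder: keep the top bits of the float32 representation."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> 21
    if small <= (EXP_OFFSET << 3):
        return 0 if bits <= 0 else 1   # zero/negative -> 0, underflow -> 1
    if small >= (EXP_OFFSET << 3) + 0x100:
        return 255                     # overflow -> largest byte
    return small - (EXP_OFFSET << 3)

def byte315_to_float(b):
    """Assumed decoder: inverse of float_to_byte315 (lossy)."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + (EXP_OFFSET << 24)
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def encode_length(num_terms, boost=1.0):
    """Assumed norm written at index time for a field with num_terms terms."""
    return float_to_byte315(boost / math.sqrt(num_terms))

def decoded_length(norm_byte):
    """Assumed fieldLength shown at query time for a stored norm byte."""
    if norm_byte == 0:
        # Assumed special case: avoid infinity by using 1 / table[255].
        return 1.0 / decoded_length(255)
    f = byte315_to_float(norm_byte)
    return 1.0 / (f * f)

nine_word_norm = encode_length(9)
print("norm byte for 9 terms:", nine_word_norm)
print("decoded length for 9 terms:", decoded_length(nine_word_norm))
print("decoded length for norm byte 0:", decoded_length(0))
```

Under these assumptions, a 9-term field decodes to about 10.24 because of the lossy quantization, and a norm byte of 0 decodes to about 5.6493154E19. If that reading is correct, the odd value would suggest the documents end up with a zero or missing norm after bulk indexing, which would point at the stored norms rather than at the scoring formula.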

Any ideas about where the problem lies?

0 answers:

No answers yet