Why am I getting a strange fieldNorm value in Elasticsearch?

Asked: 2015-09-21 23:44:20

Tags: elasticsearch lucene

I got the following explain result:

{
    "_index": "scoretest",
    "_type": "test",
    "_id": "2",
    "matched": true,
    "explanation": {
        "value": 0.8784157,
        "description": "weight(content:chinese in 1) [PerFieldSimilarity], result of:",
        "details": [
            {
                "value": 0.8784157,
                "description": "fieldWeight in 1, product of:",
                "details": [
                    {
                        "value": 1,
                        "description": "tf(freq=1.0), with freq of:",
                        "details": [
                            {
                                "value": 1,
                                "description": "termFreq=1.0"
                            }
                        ]
                    },
                    {
                        "value": 1.4054651,
                        "description": "idf(docFreq=1, maxDocs=3)"
                    },
                    {
                        "value": 0.625,
                        "description": "fieldNorm(doc=1)"
                    }
                ]
            }
        ]
    }
}

My document is:

chinese book

Note that fieldNorm is 0.625.
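For reference, the three factors in the explain output multiply to the reported score. Below is a minimal sketch of that arithmetic, assuming the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The class and method names are illustrative only and are not part of Lucene or Elasticsearch.

```java
class ScoreCheck {
    // Assumed classic TF-IDF term frequency factor: square root of the raw frequency.
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Assumed classic TF-IDF inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    static float idf(long docFreq, long maxDocs) {
        return (float) (Math.log(maxDocs / (double) (docFreq + 1)) + 1.0);
    }

    // fieldWeight as shown in the explain output: tf * idf * fieldNorm.
    static float fieldWeight(float freq, long docFreq, long maxDocs, float fieldNorm) {
        return tf(freq) * idf(docFreq, maxDocs) * fieldNorm;
    }

    public static void main(String[] args) {
        // Values taken from the explain output: freq=1.0, docFreq=1, maxDocs=3, fieldNorm=0.625.
        System.out.println("idf         = " + idf(1, 3));
        System.out.println("fieldWeight = " + fieldWeight(1.0f, 1, 3, 0.625f));
    }
}
```

The idf and fieldWeight values should agree with the 1.4054651 and 0.8784157 in the explain output up to float rounding, so the only factor in question is the fieldNorm itself.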

But according to this code:

public float lengthNorm(FieldInvertState state) {
    final int numTerms;
    if (discountOverlaps)
        numTerms = state.getLength() - state.getNumOverlap();
    else
        numTerms = state.getLength();
    return state.getBoost() * ((float) (1.0 / Math.sqrt(numTerms)));
}

I think it should be 1 / sqrt(2) ≈ 0.7071.

I found an explanation that attributes the difference to the encoding and decoding of the norm.

1. But I'm not sure: is the difference really caused by the encoding/decoding, or is my calculation wrong?

2. What is state.getBoost() in the formula?

1 Answer:

Answer 0: (score: 1)

This is because norms are encoded as single-byte floats. Basically, this is what happens to the value with the default similarity:

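import org.apache.lucene.util.SmallFloat;

double d = 1.0 / Math.sqrt(2);           // lengthNorm for a two-term field, about 0.7071
float f = (float) d;
byte b = SmallFloat.floatToByte315(f);   // encode the norm into a single byte at index time
float bf = SmallFloat.byte315ToFloat(b); // decode the byte back into a float at search time
System.out.println(bf);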

If you run this code, it should print 0.625.
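To see the rounding without a Lucene dependency, here is a sketch that re-implements the two conversions using only the JDK. It is written from my understanding of how SmallFloat.floatToByte315 and byte315ToFloat behave, so treat it as an illustration rather than the library source. The class name NormRoundTrip is hypothetical.

```java
class NormRoundTrip {
    // Offset subtracted after shifting so the kept bits fit into one unsigned byte.
    private static final int ZERO = (63 - 15) << 3; // 384

    // Keeps only the leading bits of the float (sign, exponent, top mantissa bits)
    // and rebases them into the range of a single byte.
    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> 21;
        if (small <= ZERO) {
            // Too small to represent: zero/negative map to 0, tiny positives to code 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ZERO + 0x100) {
            return (byte) -1; // Too large: clamp to the biggest code (0xFF).
        }
        return (byte) (small - ZERO);
    }

    // Reverses the shift and the rebasing; the discarded mantissa bits come back as zeros.
    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        // lengthNorm with boost 1.0 for fields of one to four terms.
        for (int numTerms = 1; numTerms <= 4; numTerms++) {
            float norm = (float) (1.0 / Math.sqrt(numTerms));
            float decoded = byte315ToFloat(floatToByte315(norm));
            System.out.println(numTerms + " term(s): " + norm + " -> " + decoded);
        }
    }
}
```

For the two-term field the decoded value is 0.625, because the low mantissa bits are truncated rather than rounded. The same loss of precision means that three-term and four-term fields both decode to 0.5.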

Unless you set an index-time boost for this field in the mapping, state.getBoost() should be 1.0.