Elasticsearch:如何计算docFreq

时间:2019-07-14 03:47:33

标签: elasticsearch lucene

我试图了解docFreq的计算方式。是每个索引,每个字段的每个映射吗?

在将explain设置为true时,我从查询中得到了这些结果。 当点击位于映射ListedName.standard 时,docFreq较低,如下所示

 {
              "value" : 16.316673,
              "description" : """weight(ListedName.standard:"eagle pointe" in 48) [PerFieldSimilarity], result of:""",
              "details" : [
                {
                  "value" : 16.316673,
                  "description" : "score(doc=48,freq=1.0 = phraseFreq=1.0\n), product of:",
                  "details" : [
                    {
                      "value" : 3.0,
                      "description" : "boost",
                      "details" : [ ]
                    },
                    {
                      "value" : 5.4388914,
                      "description" : "idf(), sum of:",
                      "details" : [
                        {
                          "value" : 1.7870536,
                          "description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                          "details" : [
                            {
                              "value" : 35.0,
                              "description" : "docFreq",
                              "details" : [ ]
                            },
                            {
                              "value" : 211.0,
                              "description" : "docCount",
                              "details" : [ ]
                            }
                          ]
                        },
                        {
                          "value" : 3.651838,
                          "description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                          "details" : [
                            {
                              "value" : 5.0,
                              "description" : "docFreq",
                              "details" : [ ]
                            },
                            {
                              "value" : 211.0,
                              "description" : "docCount",
                              "details" : [ ]
                            }
                          ]
                        }
                      ]
                    },
                    {
                      "value" : 1.0,
                      "description" : "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1) from:",
                      "details" : [
                        {
                          "value" : 1.0,
                          "description" : "phraseFreq=1.0",
                          "details" : [ ]
                        },
                        {
                          "value" : 0.0,
                          "description" : "parameter k1",
                          "details" : [ ]
                        },
                        {
                          "value" : 0.0,
                          "description" : "parameter b (norms omitted for field)",
                          "details" : [ ]
                        }
                      ]
                    }
                  ]
                }
              ]
            },

而当点击位于映射第1行 docFreq中时,点击率很高,如下所示

  {
              "value" : 1.1640041,
              "description" : """weight(Line1:"eagle pointe" in 148) [PerFieldSimilarity], result of:""",
              "details" : [
                {
                  "value" : 1.1640041,
                  "description" : "score(doc=148,freq=1.0 = phraseFreq=1.0\n), product of:",
                  "details" : [
                    {
                      "value" : 3.0,
                      "description" : "boost",
                      "details" : [ ]
                    },
                    {
                      "value" : 0.38800138,
                      "description" : "idf(), sum of:",
                      "details" : [
                        {
                          "value" : 0.18813552,
                          "description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                          "details" : [
                            {
                              "value" : 171.0,
                              "description" : "docFreq",
                              "details" : [ ]
                            },
                            {
                              "value" : 206.0,
                              "description" : "docCount",
                              "details" : [ ]
                            }
                          ]
                        },
                        {
                          "value" : 0.19986586,
                          "description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                          "details" : [
                            {
                              "value" : 169.0,
                              "description" : "docFreq",
                              "details" : [ ]
                            },
                            {
                              "value" : 206.0,
                              "description" : "docCount",
                              "details" : [ ]
                            }
                          ]
                        }
                      ]
                    },
                    {
                      "value" : 1.0,
                      "description" : "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1) from:",
                      "details" : [
                        {
                          "value" : 1.0,
                          "description" : "phraseFreq=1.0",
                          "details" : [ ]
                        },
                        {
                          "value" : 0.0,
                          "description" : "parameter k1",
                          "details" : [ ]
                        },
                        {
                          "value" : 0.0,
                          "description" : "parameter b (norms omitted for field)",
                          "details" : [ ]
                        }
                      ]
                    }
                  ]
                }
              ]
            }

0 个答案:

没有答案