ElasticSearch - Average aggregation/sort on a multi-valued numeric field with non-unique values

Date: 2016-07-04 15:30:55

Tags: elasticsearch

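I am trying to sort on the average of a multi-valued field called "rating_average". In the example I give here, the value of this field is [1,2,2]. I expect the average to be (1 + 2 + 2) / 3 = 1.66666667. In fact, I get an average of 1.5.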

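After some testing and after inspecting the extended stats, I found that this happens because the average is computed over the distinct values only, with duplicates dropped. So the statistics are applied to the set [1,2] instead of [1,2,2]. I verified this by adding an aggregation section to my query, to double-check that the average used by the sort block is the same as the average in the stats aggregation.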
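The arithmetic behind the two numbers can be reproduced outside Elasticsearch. The following is only a minimal sketch in Python of the suspected behaviour (deduplication before averaging); it does not call Elasticsearch, and the variable names are made up for illustration.

```python
from statistics import mean

# Values stored in the example document's "rating_average" field.
ratings = [1, 2, 2]

# Average over every stored value, duplicates kept (what I expect).
avg_all = mean(ratings)

# Average over the distinct values only (what I seem to get).
avg_distinct = mean(sorted(set(ratings)))

print(f"all values:      {avg_all:.8f}")
print(f"distinct values: {avg_distinct:.8f}")
```

The first line printed corresponds to the expected 1.66666667, the second to the 1.5 that the sort returns.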

The example document is as follows:

{
  "_source": {
    "content_uri": "http://data.semint.co.uk/resource/testContent1",
    "rating_average": [
      "1",
      "2",
      "2"
    ],
    "forDesk": "http://data.semint.co.uk/resource/kMFMJd1rtKD"
  }
}

The query I am running is the following:

{
  "from": 0,
  "size": 20,
  "aggs": {
    "rating_stats": {
      "extended_stats": {
        "field": "rating_average"
      }
    }
  },
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "mediaType": [
                  "http://data.semint.co.uk/resource/testMediaType3"
                ],
                "execution": "and"
              }
            }
          ]
        }
      }
    }
  },
  "fields": [ "content_uri", "rating_average"],
  "sort": [
    {
      "rating_average": {
        "order": "desc",
        "mode": "avg"
      }
    }
  ]
}

These are the results I get when running the query against the document above.

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": null,
    "hits": [
      {
        "_index": "travel_content6",
        "_type": "semantic-index",
        "_id": "http://data.semint.co.uk/resource/testContent1",
        "_score": null,
        "fields": {
          "content_uri": [
            "http://data.semint.co.uk/resource/testContent1"
          ],
          "rating_average": [1, 2, 2]
        },
        "sort": [
          1.5
        ]
      }
    ]
  },
  "aggregations": {
    "rating_stats": {
      "count": 2,
      "min": 1,
      "max": 2,
      "avg": 1.5,
      "sum": 3,
      "sum_of_squares": 5,
      "variance": 0.25,
      "std_deviation": 0.5,
      "std_deviation_bounds": {
        "upper": 2.5,
        "lower": 0.5
      }
    }
  }
}
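Every field in "rating_stats" above is consistent with the statistics being computed over the two distinct values 1 and 2. The sketch below recomputes them in Python. `extended_stats` here is a hypothetical helper written for this check, not the Elasticsearch implementation, and it assumes population variance and bounds at two standard deviations around the mean, which is what the reported numbers match.

```python
import math


def extended_stats(values, sigma=2.0):
    """Hypothetical helper: recompute the fields shown in the response.

    Assumes population variance and bounds at avg +/- sigma * std_deviation.
    """
    count = len(values)
    total = sum(values)
    sum_of_squares = sum(v * v for v in values)
    avg = total / count
    variance = sum_of_squares / count - avg * avg
    std_deviation = math.sqrt(variance)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": std_deviation,
        "upper": avg + sigma * std_deviation,
        "lower": avg - sigma * std_deviation,
    }


# Distinct values only: matches the response (count 2, sum 3, avg 1.5).
deduplicated = extended_stats([1, 2])

# All stored values: what I expected to see (count 3, sum 5).
with_duplicates = extended_stats([1, 2, 2])

for name, stats in (("distinct", deduplicated), ("all", with_duplicates)):
    print(name, stats)
```

With duplicates kept, the count would be 3, the sum 5 and the sum of squares 9, none of which appear in the response.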

0 answers:

No answers yet