Elasticsearch: aggregate only on specific entries in an array

Time: 2017-06-14 12:40:24

Tags: arrays elasticsearch aggregation

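I am new to Elasticsearch and cannot figure out how to solve the following problem. The easiest way to explain it is with an example.

The array "listing" below is part of every document in my index, but its entries vary, so the "person" with "id" 42 might appear in only 50% of my documents. What I want is the average "ranking.position.standard" for the person with id 42 across all documents in Elasticsearch.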

{
"listing": [
    {
        "person": {
            "id": 42
        },
        "ranking": {
            "position": {
                "standard": 2
            }
        }
    },
    {
        "person": {
            "id": 55
        },
        "ranking": {
            "position": {
                "standard": 7
            }
        }
    }
]
}

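Thanks for your help!

1 answer:

Answer 0 (score: 0)

First, is listing mapped as the object or the nested data type? If it is object, I don't think this will work, so try the following example: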

PUT /test
{
  "mappings": {
    "_default_": {
      "properties": {
        "listing": {
          "type": "nested"
        }
      }
    }
  }
}
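
The nested type matters here because, with the default object mapping, Elasticsearch flattens an array of objects into multi-valued fields, so the link between a given person.id and its ranking.position.standard is lost. As a rough illustration (not actual API output), the first sample document below would be held in a form similar to this under an object mapping:

{
  "listing.person.id": [42, 55],
  "listing.ranking.position.standard": [2, 7]
}

An average computed per person.id on that shape would mix the values 2 and 7 for both ids, which is why the mapping above declares listing as nested.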

PUT /test/aa/1
{
  "listing": [
    {
      "person": {
        "id": 42
      },
      "ranking": {
        "position": {
          "standard": 2
        }
      }
    },
    {
      "person": {
        "id": 55
      },
      "ranking": {
        "position": {
          "standard": 7
        }
      }
    }
  ]
}

PUT /test/aa/2
{
  "listing": [
    {
      "person": {
        "id": 42
      },
      "ranking": {
        "position": {
          "standard": 5
        }
      }
    },
    {
      "person": {
        "id": 55
      },
      "ranking": {
        "position": {
          "standard": 6
        }
      }
    }
  ]
}  
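
The requests above match Elasticsearch 5.x, where the _default_ mapping and a custom type such as aa were accepted. On 7.x and later, where mapping types were removed, the setup would presumably look like the sketch below instead. Only the first document is shown; it uses the typeless _doc endpoint, and the search request that follows should work unchanged.

PUT /test
{
  "mappings": {
    "properties": {
      "listing": {
        "type": "nested"
      }
    }
  }
}

PUT /test/_doc/1
{
  "listing": [
    { "person": { "id": 42 }, "ranking": { "position": { "standard": 2 } } },
    { "person": { "id": 55 }, "ranking": { "position": { "standard": 7 } } }
  ]
}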

GET test/_search
{
  "size": 0,
  "aggs": {
    "nest": {
      "nested": {
        "path": "listing"
      },
      "aggs": {
        "persons": {
          "terms": {
            "field": "listing.person.id",
            "size": 10
          },
          "aggs": {
            "avg_standard": {
              "avg": {
                "field": "listing.ranking.position.standard"
              }
            }
          }
        }
      }
    }
  }
}

This gives me the following result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "nest": {
      "doc_count": 4,
      "persons": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": 42,
            "doc_count": 2,
            "avg_standard": {
              "value": 3.5
            }
          },
          {
            "key": 55,
            "doc_count": 2,
            "avg_standard": {
              "value": 6.5
            }
          }
        ]
      }
    }
  }
}

This seems correct.
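
The terms aggregation above returns an average for every person id. Since the question asks only about id 42, one possible variation is to replace the terms bucket with a filter aggregation inside the nested aggregation. This is a sketch under the same mapping and sample data; the aggregation name person_42 is made up for illustration.

GET test/_search
{
  "size": 0,
  "aggs": {
    "nest": {
      "nested": {
        "path": "listing"
      },
      "aggs": {
        "person_42": {
          "filter": {
            "term": {
              "listing.person.id": 42
            }
          },
          "aggs": {
            "avg_standard": {
              "avg": {
                "field": "listing.ranking.position.standard"
              }
            }
          }
        }
      }
    }
  }
}

With the two sample documents, avg_standard under person_42 should come out as 3.5, the same value as the bucket with key 42 in the result above, because only the nested entries with person.id 42 (standard 2 and 5) pass the filter.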