Equivalent of a SQL aggregate query in Elasticsearch

Time: 2019-05-06 19:36:02

Tags: elasticsearch aggregate-functions elasticsearch-aggregation elasticsearch-sql

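I have been studying Elasticsearch aggregation queries, but I could not find out whether they support multiple aggregate functions over several grouping fields. In other words, I would like to know whether Elasticsearch can produce the equivalent of this SQL aggregate query: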

  SELECT account_no, transaction_type, count(account_no), sum(amount), max(amount) FROM index_name GROUP BY account_no, transaction_type Having count(account_no) > 10

If so, how? Thanks.

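1 answer:

Answer 0: (score: 4)

There are two possible ways to do what you are looking for in ES, and I describe both below.

I have also added a sample mapping and sample documents for reference.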

Mapping:

PUT index_name
{
  "mappings": {
    "mydocs":{
      "properties":{
        "account_no":{
          "type": "keyword"
        },
        "transaction_type":{
          "type": "keyword"
        },
        "amount":{
          "type":"double"
        }
      }
    }
  }
}

Sample documents:

Note that I have only created four transactions, all for a single customer.

POST index_name/mydocs/1
{
  "account_no": "1011",
  "transaction_type":"credit",
  "amount": 200
}

POST index_name/mydocs/2
{
  "account_no": "1011",
  "transaction_type":"credit",
  "amount": 400
}

POST index_name/mydocs/3
{
  "account_no": "1011",
  "transaction_type":"cheque",
  "amount": 100
}

POST index_name/mydocs/4
{
  "account_no": "1011",
  "transaction_type":"cheque",
  "amount": 100
}
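
As a sanity check on what the SQL query should return for these four documents, here is a minimal plain-Python sketch of the same GROUP BY / HAVING logic. It does not talk to Elasticsearch; the function name group_by_having and its parameter are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# The four sample documents indexed above.
docs = [
    {"account_no": "1011", "transaction_type": "credit", "amount": 200},
    {"account_no": "1011", "transaction_type": "credit", "amount": 400},
    {"account_no": "1011", "transaction_type": "cheque", "amount": 100},
    {"account_no": "1011", "transaction_type": "cheque", "amount": 100},
]


def group_by_having(docs, having_count_gt):
    """Emulate GROUP BY account_no, transaction_type HAVING count(account_no) > having_count_gt."""
    groups = defaultdict(list)
    for doc in docs:
        groups[(doc["account_no"], doc["transaction_type"])].append(doc["amount"])

    rows = []
    for (account_no, transaction_type), amounts in sorted(groups.items()):
        if len(amounts) > having_count_gt:  # the HAVING filter
            rows.append(
                (account_no, transaction_type, len(amounts), sum(amounts), max(amounts))
            )
    return rows


rows = group_by_having(docs, 1)
for row in rows:
    print(row)
```

With a threshold of 1 both groups (cheque and credit) survive; with the threshold of 10 from the question, none of the sample groups would.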

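There are two ways to get what you need:

Solution 1: Using the Elasticsearch Query DSL

Aggregation query:

For the Query DSL approach, I used the following aggregations to get what you are looking for.

Below is an outline of the query, to make it clearer which aggregations are siblings and which are parents: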

- Terms Aggregation (For Every Account)
  - Terms Aggregation (For Every Transaction_type)
    - Sum Amount 
    - Max Amount

Here is the actual query:

POST index_name/_search
{
  "size": 0, 
  "aggs": {
    "account_no_agg": {
      "terms": {
        "field": "account_no"
      },
      "aggs": {
        "transaction_type_agg": {
          "terms": {
            "field": "transaction_type",
            "min_doc_count": 2
          },
          "aggs": {
            "sum_amount": {
              "sum": {
                "field": "amount"
              }
            },
            "max_amount":{
              "max": {
                "field": "amount"
              }
            }
          }
        }
      }
    }
  }
}

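The important thing to note is min_doc_count, which plays the role of the HAVING clause. It keeps only buckets whose doc_count is greater than or equal to the given value, so HAVING count(account_no) > 10 corresponds to min_doc_count: 11. In my query I use min_doc_count: 2, which keeps only the (account_no, transaction_type) groups with count(account_no) >= 2, that is, > 1, so that the four sample documents still produce results.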
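
The mapping from GROUP BY fields and a HAVING threshold to nested terms aggregations can be sketched in Python as follows. This is only an illustration that builds the request body as a dict; build_group_by_query and its parameters are hypothetical names, and the aggregation names simply follow the query above.

```python
import json


def build_group_by_query(group_fields, metric_field, having_count_gt):
    """Build nested terms aggregations, one level per GROUP BY field."""
    # Innermost level: the metrics computed for each group.
    aggs = {
        "sum_amount": {"sum": {"field": metric_field}},
        "max_amount": {"max": {"field": metric_field}},
    }
    # Wrap from the last GROUP BY field outward.
    for depth, field in enumerate(reversed(group_fields)):
        terms = {"field": field}
        if depth == 0:
            # HAVING count > n  <=>  doc_count >= n + 1 on the innermost bucket.
            terms["min_doc_count"] = having_count_gt + 1
        aggs = {field + "_agg": {"terms": terms, "aggs": aggs}}
    return {"size": 0, "aggs": aggs}


query = build_group_by_query(
    ["account_no", "transaction_type"], "amount", having_count_gt=1
)
print(json.dumps(query, indent=2))
```

Called with having_count_gt=1 this produces the same body as the query above; with having_count_gt=10 the inner terms aggregation would get min_doc_count: 11. Keep in mind that a terms aggregation returns only the top 10 buckets by default, so a real query over many accounts would probably also need a size setting on each terms level.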

Query response:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "account_no_agg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "1011",                         <----  account_no
          "doc_count" : 4,                        <----  all documents for this account (not the per-group count)
          "transaction_type_agg" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "cheque",                 <---- transaction_type
                "doc_count" : 2,                  <----  count(account_no) for this group
                "sum_amount" : {                  <----  sum(amount)
                  "value" : 200.0
                },
                "max_amount" : {                  <----  max(amount)
                  "value" : 100.0
                }
              },
              {
                "key" : "credit",                 <---- another transaction_type
                "doc_count" : 2,                  <----  count(account_no) for this group
                "sum_amount" : {                  <---- sum(amount)
                  "value" : 600.0
                },
                "max_amount" : {                  <---- max(amount)
                  "value" : 400.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}

Look carefully at the results above: I have added comments where needed to show which part of the SQL query each value corresponds to.
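
Because the response is nested (one bucket per account, each containing one bucket per transaction type), it usually has to be flattened to look like SQL rows. Below is a minimal Python sketch of that step, using a trimmed copy of the response above as input. The function name flatten_buckets and the output column names are hypothetical; the per-group count is taken from the inner doc_count.

```python
def flatten_buckets(response):
    """Turn nested account -> transaction_type buckets into flat row dicts."""
    rows = []
    for account in response["aggregations"]["account_no_agg"]["buckets"]:
        for txn in account["transaction_type_agg"]["buckets"]:
            rows.append(
                {
                    "account_no": account["key"],
                    "transaction_type": txn["key"],
                    "count": txn["doc_count"],  # count per (account, type) group
                    "sum_amount": txn["sum_amount"]["value"],
                    "max_amount": txn["max_amount"]["value"],
                }
            )
    return rows


# Trimmed copy of the aggregation response shown above.
response = {
    "aggregations": {
        "account_no_agg": {
            "buckets": [
                {
                    "key": "1011",
                    "doc_count": 4,
                    "transaction_type_agg": {
                        "buckets": [
                            {
                                "key": "cheque",
                                "doc_count": 2,
                                "sum_amount": {"value": 200.0},
                                "max_amount": {"value": 100.0},
                            },
                            {
                                "key": "credit",
                                "doc_count": 2,
                                "sum_amount": {"value": 600.0},
                                "max_amount": {"value": 400.0},
                            },
                        ]
                    },
                }
            ]
        }
    }
}

rows = flatten_buckets(response)
for row in rows:
    print(row)
```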

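Solution 2: Using Elasticsearch SQL (the _xpack solution)

If you are using the SQL Access feature of X-Pack in Elasticsearch, you can simply copy and paste the SELECT query as shown below, against the mapping and documents above:

Elasticsearch SQL: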

POST /_xpack/sql?format=txt
{
  "query": "SELECT account_no, transaction_type, sum(amount), max(amount), count(account_no) FROM index_name GROUP BY account_no, transaction_type HAVING count(account_no) > 1"

}

Elasticsearch SQL result:

  account_no   |transaction_type|  SUM(amount)  |  MAX(amount)  |COUNT(account_no)
---------------+----------------+---------------+---------------+-----------------
1011           |cheque          |200.0          |100.0          |2                
1011           |credit          |600.0          |400.0          |2                

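Note that I tested these queries on ES 6.5.4. On 7.x and later, the SQL endpoint is POST /_sql rather than /_xpack/sql, and custom mapping types such as mydocs are deprecated or removed.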

Hope this helps!