Druid: descending timestamps with a groupBy query

Time: 2018-08-14 14:03:27

Tags: database large-data druid

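What I want should be very simple, but the Druid documentation has almost no information on it.

I am running a groupBy query, and the data is very large, so I "paginate" by increasing limitSpec.limit in each subsequent query.

By default, the returned array starts at the start timestamp and moves forward in time. I want the results to start at the end timestamp and move backward from there.

Does anyone know how to do this?

In other words, by default the groupBy query result looks like this: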

[ 
  {
    "version" : "v1",
    "timestamp" : "2012-01-01T00:00:00.000Z",
    "event" : {
      "total_usage" : <some_value_one>
    }
  }, 
  {
    "version" : "v1",
    "timestamp" : "2012-01-02T00:00:00.000Z",
    "event" : {
      "total_usage" : <some_value_two>
    }
  }
]

I want it to look like this:

[ 
  {
    "version" : "v1",
    "timestamp" : "2012-01-02T00:00:00.000Z",
    "event" : {
      "total_usage" : <some_value_two>
    }
  }, 
  {
    "version" : "v1",
    "timestamp" : "2012-01-01T00:00:00.000Z",
    "event" : {
      "total_usage" : <some_value_one>
    }
  }
]

2 answers:

Answer 0: (score: 0)

You can control the ordering with the "columns" property of the limitSpec. See the example below.

{
    "type"    : "default",
    "limit"   : <integer_value>,
    "columns" : [list of OrderByColumnSpec]
}

For more details, see the Druid documentation: http://druid.io/docs/latest/querying/limitspec.html
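
As a minimal sketch of what a filled-in spec might look like, the fragment below orders results in descending order by total_usage, the aggregate that appears in the question's sample output. The limit of 100 is an arbitrary placeholder. The name given in "dimension" is assumed to match an output dimension or aggregator name in your own query.

```json
{
  "type": "default",
  "limit": 100,
  "columns": [
    {
      "dimension": "total_usage",
      "direction": "descending",
      "dimensionOrder": "numeric"
    }
  ]
}
```

Each entry in "columns" is one OrderByColumnSpec, and entries are applied in the order listed, so a second entry would only break ties left by the first.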

Answer 1: (score: 0)

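You can add the timestamp as a dimension, truncated to the date (assuming you use day granularity in your query), and force Druid to sort the results by dimension values first and then by timestamp.

Example query: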

{
  "dataSource": "your_datasource",
  "queryType": "groupBy",
  "dimensions": [
    {
      "type": "default",
      "dimension": "some_dimension_in",
      "outputName": "some_dimension_out",
      "outputType": "STRING"
    },
    {
      "type": "extraction",
      "dimension": "__time",
      "outputName": "__timestamp",
      "extractionFn": {
        "type": "timeFormat",
        "format" : "yyyy-MM-dd"
      }
    }
  ],
  "aggregations": [
    {
      "type": "doubleSum",
      "name": "some_metric",
      "fieldName": "some_metric_field"
    }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 1000,
    "columns": [
      {
        "dimension": "__timestamp",
        "direction": "descending",
        "dimensionOrder": "numeric"
      },
      {
        "dimension": "some_metric",
        "direction": "descending",
        "dimensionOrder": "numeric"
      }
    ]
  },
  "intervals": [
    "2019-09-01/2019-10-01"
  ],
  "granularity": "day",
  "context": {
    "sortByDimsFirst": "true"
  }
}