BigQuery: adding elements from an array to an array of structs

Asked: 2019-02-27 12:01:56

Tags: sql google-bigquery

I have a structure that looks like this:

{"event": {
    "timestamp": [
        "2019-01-13 17:21:08.570140 UTC",
        "2019-01-14 14:10:55.475515 UTC",
        "2019-01-09 14:02:51.848917 UTC"
    ],
    "properties": [
        {"device_model": "iPhone", "country": "United Kingdom"},
        {"device_model": "Android", "country": "United States"},
        {"device_model": "iPhone", "country": "Sweden"}
    ]
}}

I want to end up with the following, where each timestamp is moved into its corresponding struct:

{"event": [
        {"timestamp": "2019-01-13 17:21:08.570140 UTC",
         "device_model": "iPhone", "country": "United Kingdom"},
        {"timestamp": "2019-01-14 14:10:55.475515 UTC",
         "device_model": "Android", "country": "United States"},
        {"timestamp": "2019-01-09 14:02:51.848917 UTC",
         "device_model": "iPhone", "country": "Sweden"}
    ]
}

I create the current structure with a query like this:

WITH
  events AS (
  SELECT
    "customer_1" AS customer_id,
    "timestamp_1" AS timestamp,
    STRUCT("iphone" AS device_model,
      "uk" AS country ) AS properties
  UNION ALL
  SELECT
    "customer_2" AS customer_id,
    "timestamp_2" AS timestamp,
    STRUCT("android" AS device_model,
      "us" AS country) AS properties
  UNION ALL
  SELECT
    "customer_2" AS customer_id,
    "timestamp_3" AS timestamp,
    STRUCT("iphone" AS device_model,
      "sweden" AS country) AS properties )
SELECT
  customer_id,
  STRUCT(ARRAY_AGG(timestamp) AS timestamp,
    ARRAY_AGG(properties) AS properties) AS event
FROM
  events
GROUP BY
  customer_id

How can I modify the query to get the desired structure?

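---Edit

I can do it like this, but it requires knowing the schema of properties when the query is generated. That is possible, but not very pretty. Is there a simpler way?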

WITH
  events AS (
  SELECT
    "customer_1" AS customer_id,
    "timestamp_1" AS timestamp,
    STRUCT("iphone" AS device_model,
      "uk" AS country ) AS properties
  UNION ALL
  SELECT
    "customer_2" AS customer_id,
    "timestamp_2" AS timestamp,
    STRUCT("android" AS device_model,
      "us" AS country) AS properties
  UNION ALL
  SELECT
    "customer_2" AS customer_id,
    "timestamp_3" AS timestamp,
    STRUCT("iphone" AS device_model,
      "sweden" AS country) AS properties )
SELECT
  customer_id,
  ARRAY_AGG(properties) AS event
FROM (
  SELECT
    customer_id,
    STRUCT(timestamp AS timestamp,
           properties.device_model AS device_model,
           properties.country AS country) AS properties
  FROM
    events)
GROUP BY
  customer_id

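1 Answer:

Answer 0: (score: 1)

You can take advantage of SELECT AS STRUCT and expand properties with properties.* inside it, so the field names never have to be listed explicitly. Something like this: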

SELECT
  customer_id,
  ARRAY_AGG(properties) AS prop
FROM (
  SELECT
    customer_id,
    (
    SELECT AS STRUCT
      timestamp,
      properties.*) AS properties
  FROM
    events e )
GROUP BY
  1

This returns:

[
  {
    "customer_id": "customer_1",
    "prop": [
      {
        "timestamp": "timestamp_1",
        "device_model": "iphone",
        "country": "uk"
      }
    ]
  },
  {
    "customer_id": "customer_2",
    "prop": [
      {
        "timestamp": "timestamp_2",
        "device_model": "android",
        "country": "us"
      },
      {
        "timestamp": "timestamp_3",
        "device_model": "iphone",
        "country": "sweden"
      }
    ]
  }
]
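
The query above relies on the `events` CTE from the question. For anyone who wants to paste something runnable, here is a self-contained sketch that inlines the same sample rows. Two details are additions and not part of the answer: the output column is named `event` to match the desired structure, and `ARRAY_AGG` gets an `ORDER BY`, because BigQuery does not guarantee element order without one. Ordering by `timestamp` is only an assumption about the order you want.

```sql
-- Sample rows copied from the question; replace with your own table.
WITH events AS (
  SELECT 'customer_1' AS customer_id, 'timestamp_1' AS timestamp,
         STRUCT('iphone' AS device_model, 'uk' AS country) AS properties
  UNION ALL
  SELECT 'customer_2', 'timestamp_2',
         STRUCT('android' AS device_model, 'us' AS country)
  UNION ALL
  SELECT 'customer_2', 'timestamp_3',
         STRUCT('iphone' AS device_model, 'sweden' AS country)
)
SELECT
  customer_id,
  -- ORDER BY is an addition: it makes the array order deterministic.
  ARRAY_AGG(event ORDER BY event.timestamp) AS event
FROM (
  SELECT
    customer_id,
    -- Build one struct per row: timestamp plus every field of properties.
    (SELECT AS STRUCT timestamp, properties.*) AS event
  FROM events
)
GROUP BY customer_id
```

With the sample rows this should give one row per customer, with `customer_2` holding two structs, and it keeps working if fields are later added to `properties`.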

You could go further and write the subquery like this:

SELECT AS STRUCT e.* EXCEPT (customer_id)
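
Spelled out in full, the `SELECT AS STRUCT e.* EXCEPT (customer_id)` shorthand could look like the sketch below. One thing to be aware of: `e.* EXCEPT (customer_id)` keeps `properties` as a nested STRUCT, so on its own it does not flatten the fields next to `timestamp`. The second column, `flat`, is a suggested variation (not from the answer) that also excludes `properties` and then expands it with `e.properties.*`. The column names `nested`, `flat`, `nested_event` and `flat_event` are made up for illustration, and the sample rows are the ones from the question.

```sql
-- Sample rows copied from the question; replace with your own table.
WITH events AS (
  SELECT 'customer_1' AS customer_id, 'timestamp_1' AS timestamp,
         STRUCT('iphone' AS device_model, 'uk' AS country) AS properties
  UNION ALL
  SELECT 'customer_2', 'timestamp_2',
         STRUCT('android' AS device_model, 'us' AS country)
  UNION ALL
  SELECT 'customer_2', 'timestamp_3',
         STRUCT('iphone' AS device_model, 'sweden' AS country)
)
SELECT
  customer_id,
  ARRAY_AGG(nested) AS nested_event,  -- structs of (timestamp, properties)
  ARRAY_AGG(flat) AS flat_event       -- structs of (timestamp, device_model, country)
FROM (
  SELECT
    e.customer_id,
    -- The shorthand from the answer: every column except customer_id.
    (SELECT AS STRUCT e.* EXCEPT (customer_id)) AS nested,
    -- Assumed variation: drop properties too, then expand its fields.
    (SELECT AS STRUCT e.* EXCEPT (customer_id, properties),
                      e.properties.*) AS flat
  FROM events AS e
)
GROUP BY customer_id
```

The `flat` form is handy when the row has many top-level columns, since only the columns to leave out need to be named. If the order of the array elements matters, add an `ORDER BY` inside each `ARRAY_AGG`.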