BigQuery GroupBy with STRUCT

Time: 2017-12-11 18:29:11

Tags: sql google-bigquery

In BigQuery, I can successfully run the following query using standard SQL:

SELECT 
  COUNT(*) AS totalCount,
  city,
  DATE_TRUNC(timeInterval.intervalStart, YEAR) AS start
FROM 
  sandbox.CountByCity
GROUP BY 
    city, start

But it fails when I nest the start value in a STRUCT, like this:

SELECT 
  COUNT(*) AS totalCount,
  city,
  STRUCT(
    DATE_TRUNC(timeInterval.intervalStart, YEAR) AS start
  ) as timeSpan
FROM 
  sandbox.CountByCity
GROUP BY 
    city, timeSpan.start

In this case, I get the following error message:

  Cannot GROUP BY field references from SELECT list alias timeSpan at [10:11]

What is the correct way to write the query so that the start value is nested in a STRUCT?

2 answers:

Answer 0: (score: 4)

You can do this using ANY_VALUE. The struct value you get back is well-defined, because every row in a group has the same value for it:

SELECT 
  COUNT(*) AS totalCount,
  city,
  ANY_VALUE(STRUCT(
    DATE_TRUNC(timeInterval.intervalStart, YEAR) AS start
  )) as timeSpan
FROM 
  sandbox.CountByCity
GROUP BY 
    city, DATE_TRUNC(timeInterval.intervalStart, YEAR);

Here is an example using some sample data:

WITH `sandbox.CountByCity` AS (
  SELECT 'Seattle' AS city, STRUCT(DATE '2017-12-11' AS intervalStart) AS timeInterval UNION ALL
  SELECT 'Seattle', STRUCT(DATE '2016-11-10' AS intervalStart) UNION ALL
  SELECT 'Seattle', STRUCT(DATE '2017-03-24' AS intervalStart) UNION ALL
  SELECT 'Kirkland', STRUCT(DATE '2017-02-01' AS intervalStart)
)
SELECT 
  COUNT(*) AS totalCount,
  city,
  ANY_VALUE(STRUCT(
    DATE_TRUNC(timeInterval.intervalStart, YEAR) AS start
  )) as timeSpan
FROM 
  `sandbox.CountByCity`
GROUP BY 
    city, DATE_TRUNC(timeInterval.intervalStart, YEAR);

You can also consider submitting a feature request to enable GROUP BY with STRUCT types.

Answer 1: (score: 0)

I'm not sure why you would want this, but I trust there is a reason, so try the following (it at least formally does what you ask):
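A minimal sketch of one way to get this shape, offered as an assumption and possibly different from the query this answer had in mind: aggregate in a subquery with a plain start column, then wrap that column in a STRUCT in the outer query. The sample rows in the WITH clause are borrowed from the other answer so the query runs on its own.

```sql
#standardSQL
-- Sample data standing in for the real table (copied from the other answer).
WITH `sandbox.CountByCity` AS (
  SELECT 'Seattle' AS city, STRUCT(DATE '2017-12-11' AS intervalStart) AS timeInterval UNION ALL
  SELECT 'Seattle', STRUCT(DATE '2016-11-10' AS intervalStart) UNION ALL
  SELECT 'Seattle', STRUCT(DATE '2017-03-24' AS intervalStart) UNION ALL
  SELECT 'Kirkland', STRUCT(DATE '2017-02-01' AS intervalStart)
)
SELECT
  totalCount,
  city,
  -- Build the STRUCT only after grouping is done.
  STRUCT(start AS start) AS timeSpan
FROM (
  -- Inner query: the original, working GROUP BY on a plain column alias.
  SELECT
    COUNT(*) AS totalCount,
    city,
    DATE_TRUNC(timeInterval.intervalStart, YEAR) AS start
  FROM
    `sandbox.CountByCity`
  GROUP BY
    city, start
);
```

Because the grouping happens before the STRUCT is built, the GROUP BY never has to reference a field of a SELECT-list alias, which is what the error message complains about.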

  