How do I use _TABLE_SUFFIX to limit the datasets in a complex query?

Asked: 2017-05-02 14:20:16

Tags: sql google-bigquery

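I understand how _TABLE_SUFFIX works and have used it successfully in simpler queries before. I'm currently trying to build an application that pulls active users from 100+ datasets, but I'm already running into resource limits. To get around those limits, I plan to loop and run the query multiple times, using _TABLE_SUFFIX to limit how much is selected in each run.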

Here is my current query:

WITH allTables AS (SELECT
  app,
  date,
  SUM(CASE WHEN period = 30  THEN users END) as days_30
FROM (
  SELECT
    CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) as app,
    dates.date as date,
    periods.period as period,
    COUNT(DISTINCT user_dim.app_info.app_instance_id) as users
  FROM `table.app_events_*` as activity
    WHERE _TABLE_SUFFIX BETWEEN '20170101' AND '20170502'
    OR _TABLE_SUFFIX BETWEEN 'intraday_20170101' AND 'intraday_20170502'
  CROSS JOIN
    UNNEST(event_dim) AS event
  CROSS JOIN (
    SELECT DISTINCT
      TIMESTAMP_TRUNC(TIMESTAMP_MICROS(event.timestamp_micros), DAY, 'UTC') as date
    FROM `table.app_events_*`
    WHERE _TABLE_SUFFIX BETWEEN '20170101' AND '20170502'
    OR _TABLE_SUFFIX BETWEEN 'intraday_20170101' AND 'intraday_20170502'

    CROSS JOIN
        UNNEST(event_dim) as event) as dates
    CROSS JOIN (
      SELECT
        period
      FROM (
        SELECT 30 as period
      )
    ) as periods
    WHERE
      dates.date >= TIMESTAMP_TRUNC(TIMESTAMP_MICROS(event.timestamp_micros), DAY, 'UTC')
    AND
      FLOOR(TIMESTAMP_DIFF(dates.date, TIMESTAMP_MICROS(event.timestamp_micros), DAY)/periods.period) = 0
    GROUP BY 1,2,3
  )
  GROUP BY 1,2) 
SELECT
 app as target,
 UNIX_SECONDS(date) as datapoint_time,
 SUM(days_30) as datapoint_value
FROM allTables
WHERE date >= TIMESTAMP_ADD(TIMESTAMP_TRUNC(CURRENT_TIMESTAMP, Day, 'UTC'), INTERVAL -30 DAY)
GROUP BY date,1
ORDER BY date ASC

Currently this gives me:

Error: Syntax error: Expected ")" but got keyword CROSS at [14:3]

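So my question is: how can I limit the amount of data I pull with this query using _TABLE_SUFFIX? I feel like I'm missing something very simple here. Any help would be great, thanks!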

1 Answer:

Answer 0: (score: 2)

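The CROSS JOIN UNNEST(event_dim) AS event (and the cross joins that follow it) needs to come before the WHERE clause: a FROM clause lists all of its joins first, and only then can a single WHERE filter the joined result. You can read more in the query syntax documentation.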
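
As a rough sketch of that reordering, the inner SELECT from the question could look something like the following. This is untested against the asker's data and makes a few assumptions: the table name `table.app_events_*` and the suffix ranges are copied from the question, the two separate WHERE clauses in the outer query are merged into one, the _TABLE_SUFFIX condition is wrapped in parentheses so the OR does not swallow the ANDed date conditions, and the nested periods subquery is collapsed to a single SELECT.

```sql
SELECT
  CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS app,
  dates.date AS date,
  periods.period AS period,
  COUNT(DISTINCT user_dim.app_info.app_instance_id) AS users
FROM `table.app_events_*` AS activity
-- All joins come first, before any WHERE clause.
CROSS JOIN UNNEST(event_dim) AS event
CROSS JOIN (
  SELECT DISTINCT
    TIMESTAMP_TRUNC(TIMESTAMP_MICROS(event.timestamp_micros), DAY, 'UTC') AS date
  FROM `table.app_events_*`
  CROSS JOIN UNNEST(event_dim) AS event
  -- Inside the subquery the same rule applies: join first, then filter.
  WHERE _TABLE_SUFFIX BETWEEN '20170101' AND '20170502'
     OR _TABLE_SUFFIX BETWEEN 'intraday_20170101' AND 'intraday_20170502'
) AS dates
-- Simplified from the question's nested subquery (assumed equivalent).
CROSS JOIN (SELECT 30 AS period) AS periods
WHERE
  -- Parenthesized so the OR binds before the AND conditions below.
  (_TABLE_SUFFIX BETWEEN '20170101' AND '20170502'
   OR _TABLE_SUFFIX BETWEEN 'intraday_20170101' AND 'intraday_20170502')
  AND dates.date >= TIMESTAMP_TRUNC(TIMESTAMP_MICROS(event.timestamp_micros), DAY, 'UTC')
  AND FLOOR(TIMESTAMP_DIFF(dates.date, TIMESTAMP_MICROS(event.timestamp_micros), DAY) / periods.period) = 0
GROUP BY 1, 2, 3
```

To run the query in smaller batches, the two pairs of suffix bounds are the only values that would need to change between runs; the rest of the query in the question (the allTables wrapper and the final SELECT) can presumably stay as it is.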