How can I reduce the size of my SQL query for BigQuery?

Time: 2018-12-03 11:07:00

Tags: sql google-bigquery query-optimization

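I want to fetch data from a BigQuery database, but I get this error: the query is too large. The maximum query length is 256.000K characters, including comments and white space characters. Below is one part of the query, which I repeat 21 times.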

WITH data AS 
(
 SELECT
 IFNULL(department, 'UNKNOWN_DEPARTMENT') AS dept,


 'C7s' AS campus,
 COUNTIF(task.taskRaised.raisedAt.milliSeconds BETWEEN 1542565800000 AND 1543170599999) AS taskCount_0,
 COUNTIF(task.taskRaised.raisedAt.milliSeconds BETWEEN 1542565800000 AND 1543170599999 

 AND IF (task.deadline.currentEscalationLevel NOT IN 
 (
 'ESC_ACKNOWLEDGEMENT'
 )
, task.deadline.currentEscalationLevel, 'NOT_ESCALATED') NOT IN 
 (
 'NOT_ESCALATED'
 )
) AS escCount_0,
 COUNTIF(task.taskRaised.raisedAt.milliSeconds BETWEEN 1541961000000 AND 1542565799999) AS taskCount_1,
 COUNTIF(task.taskRaised.raisedAt.milliSeconds BETWEEN 1541961000000 AND 1542565799999 
 AND IF (task.deadline.currentEscalationLevel NOT IN 
 (
 'ESC_ACKNOWLEDGEMENT'
 )
, task.deadline.currentEscalationLevel, 'NOT_ESCALATED') NOT IN 
 (
 'NOT_ESCALATED'
 )
) AS escCount_1,
 COUNTIF(task.taskRaised.raisedAt.milliSeconds BETWEEN 1541356200000 AND 1541960999999) AS taskCount_2,
 COUNTIF(task.taskRaised.raisedAt.milliSeconds BETWEEN 1541356200000 AND 1541960999999 
 AND IF (task.deadline.currentEscalationLevel NOT IN 
 (
 'ESC_ACKNOWLEDGEMENT'
 )
, task.deadline.currentEscalationLevel, 'NOT_ESCALATED') NOT IN 
 (
 'NOT_ESCALATED'
 )
) AS escCount_2 
 FROM

 `nsimplbigquery.TaskManagement.C7s_*`

 WHERE
 _TABLE_SUFFIX IN 
 (
 '2018_47_11',
 '2018_45_11',
 '2018_46_11'
 )
 AND IFNULL(department, 'UNKNOWN_DEPARTMENT') IN 
 (
 'ENGG_AND_MAINT_DEPARTMENT',
 'FNB_DEPARTMENT',
 'TELECOM_DEPARTMENT',
 'IT_DEPARTMENT',
 'BILLING_AND_INSURANCE',
 'HOUSEKEEPING_DEPARTMENT'
 )
 AND task.taskRaised.raisedAt.milliSeconds BETWEEN 1541356200000 AND 1543170599999 
 GROUP BY
 dept
)
,
mainQuery AS 
(
 SELECT
 dept,
 campus,
 SUM(taskCount_0) AS taskCount_0,
 SUM(escCount_0) AS escCount_0,
 CAST(SAFE_DIVIDE(SUM(escCount_0), SUM(taskCount_0)) * 10000 AS INT64) AS escPerc_0,
 SUM(taskCount_1) AS taskCount_1,
 SUM(escCount_1) AS escCount_1,
 CAST(SAFE_DIVIDE(SUM(escCount_1), SUM(taskCount_1)) * 10000 AS INT64) AS escPerc_1,
 SUM(taskCount_2) AS taskCount_2,
 SUM(escCount_2) AS escCount_2,
 CAST(SAFE_DIVIDE(SUM(escCount_2), SUM(taskCount_2)) * 10000 AS INT64) AS escPerc_2 
 FROM
 data 
 GROUP BY
 ROLLUP (campus, dept)
)
SELECT
 dept,
 campus,
 taskCount_0,
 escCount_0,
 escPerc_0,
 taskCount_1,
 escCount_1,
 escPerc_1,
 taskCount_2,
 escCount_2,
 escPerc_2 
FROM
 mainQuery 
WHERE
 campus IS NOT NULL 
ORDER BY
 CASE
 WHEN
 dept IS NULL 
 THEN
 1 
 ELSE
 0 
 END
 ASC, dept ASC, campus ASC;

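This is the query I repeat many times, because I have many ids. In each copy I replace C7s with one of the following ids:

C7z, C7u, H0B, IDp, ITR, C7i, C7j, C7k, C7l, C7m, C7o, C71, C7t, F6qZ, C7w, GIui, Fs, C70, C7p, C7r

As explained above, look at the FROM clause line nsimplbigquery.TaskManagement.C7s_*. In the next copy of the query the table name changes, for example:

nsimplbigquery.TaskManagement.C7z_*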

1 answer:

Answer 0 (score: 0)

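Instead of repeating the whole SELECT statement 21 times, use the approach below. The _TABLE_SUFFIX list will then contain 3x21 = 63 entries, but this should resolve the query-length problem.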

FROM `nsimplbigquery.TaskManagement.*` 
WHERE _TABLE_SUFFIX IN (
  'C7s_2018_47_11',
  'C7s_2018_45_11',
  'C7s_2018_46_11',
  'C7z_2018_47_11',
  'C7z_2018_45_11',
  'C7z_2018_46_11',
  'C7u_2018_47_11',
  'C7u_2018_45_11',
  'C7u_2018_46_11',
  ...
  ...
  ... 
  'C7r_2018_47_11',
  'C7r_2018_45_11',
  'C7r_2018_46_11'
)
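
One detail the fragment above leaves implicit: once all campuses are read through a single wildcard, the hard-coded 'C7s' AS campus literal from the question no longer fits, so the campus has to be derived from _TABLE_SUFFIX and added to the GROUP BY. The following is a minimal sketch of that idea, not a tested drop-in replacement. It assumes every table name has the form <campus>_<year>_<week>_<n> with no underscore inside the campus id, and that all tables matched by TaskManagement.* share the same schema. Only the week-0 columns and two campuses are shown; the remaining columns and suffixes from the question would be added in the same way.

```sql
WITH data AS
(
  SELECT
    IFNULL(department, 'UNKNOWN_DEPARTMENT') AS dept,
    -- Assumption: the campus id is the part of the suffix before the
    -- first underscore, e.g. 'C7s_2018_47_11' -> 'C7s'.
    SPLIT(_TABLE_SUFFIX, '_')[OFFSET(0)] AS campus,
    COUNTIF(task.taskRaised.raisedAt.milliSeconds BETWEEN 1542565800000 AND 1543170599999) AS taskCount_0,
    COUNTIF(task.taskRaised.raisedAt.milliSeconds BETWEEN 1542565800000 AND 1543170599999
      AND IF(task.deadline.currentEscalationLevel NOT IN ('ESC_ACKNOWLEDGEMENT'),
             task.deadline.currentEscalationLevel, 'NOT_ESCALATED') NOT IN ('NOT_ESCALATED')
    ) AS escCount_0
    -- taskCount_1, escCount_1, taskCount_2, escCount_2: same as in the question.
  FROM
    `nsimplbigquery.TaskManagement.*`
  WHERE
    _TABLE_SUFFIX IN
    (
      -- Two campuses shown for brevity; list all 63 suffixes in practice.
      'C7s_2018_47_11',
      'C7s_2018_45_11',
      'C7s_2018_46_11',
      'C7z_2018_47_11',
      'C7z_2018_45_11',
      'C7z_2018_46_11'
    )
    AND IFNULL(department, 'UNKNOWN_DEPARTMENT') IN
    (
      'ENGG_AND_MAINT_DEPARTMENT',
      'FNB_DEPARTMENT',
      'TELECOM_DEPARTMENT',
      'IT_DEPARTMENT',
      'BILLING_AND_INSURANCE',
      'HOUSEKEEPING_DEPARTMENT'
    )
    AND task.taskRaised.raisedAt.milliSeconds BETWEEN 1541356200000 AND 1543170599999
  GROUP BY
    dept, campus  -- campus now varies per table, so it must be grouped on
)
SELECT
  dept,
  campus,
  taskCount_0,
  escCount_0
FROM
  data
ORDER BY
  campus ASC, dept ASC;
```

The mainQuery CTE and the final SELECT from the question should be usable on top of this data CTE without changes, since it still exposes dept, campus and the per-week counters.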