How do I get multiple counts under different conditions in the same query in BigQuery?

Time: 2018-07-27 19:23:27

Tags: google-bigquery

I want to get several counts in the same query, each under a different condition.

  • SELECT EXACT_COUNT_DISTINCT(columnA) as TotalCounts
  • SELECT EXACT_COUNT_DISTINCT(columnA) where sum(columnB + ColumnC) > 0 as Engaged
  • SELECT EXACT_COUNT_DISTINCT(columnA) where sum(columnB + ColumnC) = 0 as NonEngaged

In theory, Engaged + NonEngaged should equal TotalCounts in my case.

1 Answer:

Answer 0 (score: 2)

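The following is for BigQuery Standard SQL. Make sure you keep the first line of the script below: it tells BigQuery to use Standard SQL even if you have not set that explicitly in the UI or in whatever tool, library, or API you are using.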

  
#standardSQL
SELECT 
  COUNT(DISTINCT columnA) AS TotalCounts,
  COUNT(DISTINCT IF(flag , columnA, NULL)) AS Engaged,
  COUNT(DISTINCT IF(NOT flag , columnA, NULL)) AS NonEngaged
FROM (  
  SELECT columnA, SUM(columnB + ColumnC) > 0 AS flag
  FROM `project.dataset.table`
  GROUP BY columnA
)
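
To try the query above without a real table, the following sketch runs the same logic over a few inline rows. The CTE name `sample_rows` and its values are made up for illustration; they only stand in for `project.dataset.table`.

```sql
#standardSQL
-- Hypothetical sample rows standing in for `project.dataset.table`.
WITH sample_rows AS (
  SELECT 'a' AS columnA, 1 AS columnB, 0 AS columnC UNION ALL
  SELECT 'a', 0, 2 UNION ALL
  SELECT 'b', 0, 0 UNION ALL
  SELECT 'c', 3, 1 UNION ALL
  SELECT 'd', 0, 0
)
SELECT
  COUNT(DISTINCT columnA) AS TotalCounts,
  COUNT(DISTINCT IF(flag, columnA, NULL)) AS Engaged,
  COUNT(DISTINCT IF(NOT flag, columnA, NULL)) AS NonEngaged
FROM (
  -- One row per columnA; flag is TRUE when the per-key sum is positive.
  SELECT columnA, SUM(columnB + columnC) > 0 AS flag
  FROM sample_rows
  GROUP BY columnA
)
```

With these rows, `a` and `c` have positive sums and `b` and `d` sum to zero, so the result is TotalCounts = 4, Engaged = 2, NonEngaged = 2. Note that `NOT flag` is also TRUE for a negative sum, and that a key whose sum is NULL is counted in neither Engaged nor NonEngaged, so the two only add up to TotalCounts when every sum is non-NULL.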
  

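Update to address an additional question from the comments: what if we need to get counts from two separate tables in the same query? Suppose a fourth column in the query above should be the distinct count of columnY from tableXYZ.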

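The simplest option is below. A CROSS JOIN is fine here because each side returns exactly one row, so the joined result is also a single row.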

#standardSQL
SELECT 
  TotalCounts,
  Engaged,
  NonEngaged,
  distinctY
FROM (
  SELECT 
    COUNT(DISTINCT columnA) AS TotalCounts,
    COUNT(DISTINCT IF(flag , columnA, NULL)) AS Engaged,
    COUNT(DISTINCT IF(NOT flag , columnA, NULL)) AS NonEngaged 
  FROM (  
    SELECT columnA, SUM(columnB + ColumnC) > 0 AS flag
    FROM `project.dataset.table`
    GROUP BY columnA
  )
)
CROSS JOIN (
  SELECT COUNT(DISTINCT columnY) AS distinctY
  FROM `project.dataset.tableXYZ`
)
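
As a quick check of the cross-join approach, here is the same query over inline data. Both CTEs (`sample_rows`, `sample_xyz`) and their values are hypothetical stand-ins for `project.dataset.table` and `project.dataset.tableXYZ`.

```sql
#standardSQL
-- Hypothetical sample rows standing in for the two real tables.
WITH sample_rows AS (
  SELECT 'a' AS columnA, 1 AS columnB, 0 AS columnC UNION ALL
  SELECT 'b', 0, 0 UNION ALL
  SELECT 'c', 3, 1
),
sample_xyz AS (
  SELECT 'x' AS columnY UNION ALL
  SELECT 'x' UNION ALL
  SELECT 'y'
)
SELECT TotalCounts, Engaged, NonEngaged, distinctY
FROM (
  -- Single-row aggregate over the first table.
  SELECT
    COUNT(DISTINCT columnA) AS TotalCounts,
    COUNT(DISTINCT IF(flag, columnA, NULL)) AS Engaged,
    COUNT(DISTINCT IF(NOT flag, columnA, NULL)) AS NonEngaged
  FROM (
    SELECT columnA, SUM(columnB + columnC) > 0 AS flag
    FROM sample_rows
    GROUP BY columnA
  )
)
CROSS JOIN (
  -- Single-row aggregate over the second table.
  SELECT COUNT(DISTINCT columnY) AS distinctY
  FROM sample_xyz
)
```

This returns one row: TotalCounts = 3, Engaged = 2, NonEngaged = 1, distinctY = 2. Because each subquery aggregates without a GROUP BY at its outer level, each yields exactly one row, and the cross join of 1 x 1 rows cannot multiply the result.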