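New to BigQuery: how to count the number of groups that contain a particular row in SQL

Time: 2018-08-07 18:47:25

Tags: sql firebase google-bigquery

I'm new to this.

I use Firebase events to track user activity on a website.

As a simple example, my "happy" use case is a user completing steps A, B, and C. I want to count the number of users who are "happy" or "unhappy". A user is unhappy if their session does not contain all three events.

Here is sample SQL with some simple data. I am able to count the number of "C" events in happy sessions, but I don't know how to identify the unhappy sessions.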

WITH testData AS (
  SELECT "id1" as idd, "A" as name UNION ALL
  SELECT "id1" , "B"  UNION ALL
  SELECT "id1" , "C"  UNION ALL

  SELECT "id2" , "A"  UNION ALL
  SELECT "id2" , "B"  UNION ALL

  SELECT "id3" , "A"  UNION ALL
  SELECT "id3" , "B"  UNION ALL
  SELECT "id3" , "C"  UNION ALL

  SELECT "id4" , "A"  UNION ALL
  SELECT "id4" , "B"  UNION ALL

  SELECT "id5" , "A"  UNION ALL
  SELECT "id5" , "B"  UNION ALL
  SELECT "id5" , "C"  UNION ALL
  SELECT "id5" , "A"  UNION ALL
  SELECT "id5" , "B"  UNION ALL
  SELECT "id5" , "C"  
)
SELECT * 
  FROM 
    (SELECT 
        idd, COUNT(name) as PASSED 
      FROM
        testData where name = "C"
      GROUP BY
        idd)

   UNION ALL

   (SELECT 
        idd, NUMERIC '0' as PASSED 
      FROM
        testData where name != "C"
      GROUP BY
        idd)
  ORDER BY
    idd

Row idd PASSED   
1   id1 1    
2   id1 0    
3   id2 0    
4   id3 1    
5   id3 0    
6   id4 0    
7   id5 2    
8   id5 0    

I expected the result to look like this:

Row idd PASSED   
1   id1 1    
3   id2 0    
4   id3 1    
6   id4 0    
7   id5 2    

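Any suggestions would be greatly appreciated.

Also, can anyone recommend a really good advanced SQL tutorial?

Andy

4 answers:

Answer 0: (score: 2)

You can use aggregation. Assuming only these three states are allowed, here is one way: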

select idd,
       (CASE WHEN count(distinct name) = 3 THEN 'Happy' else 'Unhappy' end) as state_of_mind
from testData
group by idd;

If other states can exist, then count only the three relevant ones:

select idd,
       (CASE WHEN count(distinct case when name in ('A', 'B', 'C') THEN name END) = 3 THEN 'Happy' else 'Unhappy' end) as state_of_mind
from testData
group by idd
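
The question asks for the number of users in each state, not only a label per user. As a minimal sketch, the per-user query above can be wrapped in a second aggregation. The sample rows are abbreviated from the question's data, and the names `perUser` and `users` are illustrative assumptions, not part of the original answer.

```sql
-- Sketch only: sample rows abbreviated from the question's testData.
WITH testData AS (
  SELECT "id1" AS idd, "A" AS name UNION ALL
  SELECT "id1", "B" UNION ALL
  SELECT "id1", "C" UNION ALL
  SELECT "id2", "A" UNION ALL
  SELECT "id2", "B" UNION ALL
  SELECT "id5", "A" UNION ALL
  SELECT "id5", "B" UNION ALL
  SELECT "id5", "C" UNION ALL
  SELECT "id5", "C"
),
-- perUser is a hypothetical name: one row per idd with its state.
perUser AS (
  SELECT
    idd,
    CASE
      WHEN COUNT(DISTINCT CASE WHEN name IN ('A', 'B', 'C') THEN name END) = 3
        THEN 'Happy'
      ELSE 'Unhappy'
    END AS state_of_mind
  FROM testData
  GROUP BY idd
)
-- Second aggregation: number of users in each state.
SELECT state_of_mind, COUNT(*) AS users
FROM perUser
GROUP BY state_of_mind
ORDER BY state_of_mind;
```

With these sample rows, id1 and id5 contain all three steps and id2 does not, so the query should return Happy = 2 and Unhappy = 1.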

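Answer 1: (score: 1)

Based on the code in your question, you rely only on step C (which makes sense only if the sole way to reach step C is to complete step A and then step B first).
So I followed the idea in your original query and fixed it: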

#standardSQL
SELECT idd, COUNTIF(name = "C") Passed
FROM testData 
GROUP BY idd
-- ORDER BY idd

When applied to the dummy data in your question, the result is as expected:

Row     idd     Passed   
1       id1     1    
2       id2     0    
3       id3     1    
4       id4     0    
5       id5     2    
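
To illustrate the caveat about relying only on step C, the sketch below puts the step-C count next to a check for all three steps. The session `id6`, which reaches C without A or B, is a hypothetical row that does not appear in the question's data, and `has_all_steps` is an illustrative column name.

```sql
-- Sketch only: id1 and id2 come from the question; id6 is hypothetical.
WITH testData AS (
  SELECT "id1" AS idd, "A" AS name UNION ALL
  SELECT "id1", "B" UNION ALL
  SELECT "id1", "C" UNION ALL
  SELECT "id2", "A" UNION ALL
  SELECT "id2", "B" UNION ALL
  -- hypothetical session that reaches C without A or B
  SELECT "id6", "C"
)
SELECT
  idd,
  -- number of step-C events in the session
  COUNTIF(name = "C") AS Passed,
  -- TRUE only when A, B, and C all occur at least once
  COUNT(DISTINCT IF(name IN ("A", "B", "C"), name, NULL)) = 3 AS has_all_steps
FROM testData
GROUP BY idd
ORDER BY idd;
```

For id1 and id2 the two columns agree (1/true and 0/false). For the hypothetical id6, `Passed` is 1 while `has_all_steps` is false, which is the case where counting only step C would misclassify a session.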

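Answer 2: (score: 0)

"Also, can anyone recommend a really good advanced SQL tutorial?"

These helped me :-)
Tutorial:
http://www.sql-tutorial.ru/
Exercises (with sample databases and a checker):
http://www.sql-ex.ru/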

Answer 3: (score: 0)

Thanks.

The trick that Gordon Linoff suggested above is to use "CASE", which I had never seen before.

Here is my final solution:

WITH testData AS (
-- happy 
  SELECT "id1" as idd, "A" as name UNION ALL
  SELECT "id1" , "B"  UNION ALL
  SELECT "id1" , "C"  UNION ALL

-- not happy
  SELECT "id2" , "A"  UNION ALL
  SELECT "id2" , "B"  UNION ALL

-- happy
  SELECT "id3" , "A"  UNION ALL
  SELECT "id3" , "B"  UNION ALL
  SELECT "id3" , "C"  UNION ALL

-- not happy  
  SELECT "id4" , "A"  UNION ALL
  SELECT "id4" , "B"  UNION ALL

-- happy
  SELECT "id5" , "A"  UNION ALL
  SELECT "id5" , "B"  UNION ALL
  SELECT "id5" , "C"  UNION ALL
  SELECT "id5" , "A"  UNION ALL
  SELECT "id5" , "B"  UNION ALL
  SELECT "id5" , "C"  
),
isHappyTable AS (
  SELECT
    idd,
    CASE
      WHEN name IN ('C')
        THEN NUMERIC '1'
      ELSE
        NUMERIC '0'
    END AS isHappy
  FROM
    testData
)

SELECT 
    idd, 
    SUM(isHappy) AS isHappy
  FROM
    isHappyTable
  GROUP BY
    idd
  ORDER BY
    idd

Row idd isHappy  
1   id1 1    
2   id2 0    
3   id3 1    
4   id4 0    
5   id5 2
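
Note that the `isHappy` column above is a count of step-C events (id5 gets 2), not a yes/no flag. If a flag is wanted, one possible sketch is to aggregate the condition directly with BigQuery's `COUNTIF` and `LOGICAL_OR`. The sample rows are abbreviated from the data above, and the column names `c_events` and `reached_c` are illustrative assumptions.

```sql
-- Sketch only: sample rows abbreviated from the testData above.
WITH testData AS (
  SELECT "id1" AS idd, "A" AS name UNION ALL
  SELECT "id1", "B" UNION ALL
  SELECT "id1", "C" UNION ALL
  SELECT "id2", "A" UNION ALL
  SELECT "id2", "B" UNION ALL
  SELECT "id5", "C" UNION ALL
  SELECT "id5", "C"
)
SELECT
  idd,
  COUNTIF(name = "C") AS c_events,     -- how many times step C occurred
  LOGICAL_OR(name = "C") AS reached_c  -- TRUE if step C occurred at least once
FROM testData
GROUP BY idd
ORDER BY idd;
```

With these rows, id1 should give 1 and true, id2 should give 0 and false, and id5 should give 2 and true. This avoids the intermediate CTE and the NUMERIC literals, at the cost of still treating step C alone as the signal for a happy session.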