How to group users as A, B, or both

Time: 2015-06-26 22:26:40

Tags: sql google-bigquery

If I have data like this:

user | tag
-----|-----
bob  |  A
bob  |  A
bob  |  B
tom  |  A
tom  |  A
amy  |  B
amy  |  B
jen  |  A
jen  |  A

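Across millions of users, I want to know how many users have only tag A, only tag B, and both. It's the "both" case that I'm stuck on.

In this case, the answer would be: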

Both: 1
A only: 2
B only: 1

I don't need the user IDs returned, just the counts. I'm using BigQuery.

2 answers:

Answer 0: (score: 3)

Here is one solution using the SOME and EVERY aggregate functions:

SELECT
  SUM(category == 'both') AS both_count,
  SUM(category == 'A') AS a_count,
  SUM(category == 'B') AS b_count
FROM (
  SELECT
    name,
    CASE WHEN SOME(tag == 'A') AND SOME(tag == 'B') THEN 'both' 
         WHEN EVERY(tag == 'A') THEN 'A' 
         WHEN EVERY(tag == 'B') THEN 'B'
         ELSE 'none' END AS category
  FROM 
    (SELECT 'bob' as name, 'A' as tag),
    (SELECT 'bob' as name, 'A' as tag),
    (SELECT 'bob' as name, 'B' as tag),
    (SELECT 'tom' as name, 'A' as tag),
    (SELECT 'tom' as name, 'A' as tag),
    (SELECT 'amy' as name, 'B' as tag),
    (SELECT 'amy' as name, 'B' as tag),
    (SELECT 'jen' as name, 'A' as tag),
    (SELECT 'jen' as name, 'A' as tag)
  GROUP BY name)
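
The same per-user flag idea can be tried outside BigQuery. Below is a minimal sketch, assuming Python's built-in sqlite3 module and a hypothetical table named user_tags (neither is part of the original answer). SQLite has no SOME or EVERY, so MAX(tag = 'A') stands in for SOME(tag == 'A'):

```python
import sqlite3

# Sample rows from the question; the table and column names are hypothetical.
ROWS = [("bob", "A"), ("bob", "A"), ("bob", "B"),
        ("tom", "A"), ("tom", "A"),
        ("amy", "B"), ("amy", "B"),
        ("jen", "A"), ("jen", "A")]

# Inner query: one row per user with 0/1 flags for each tag.
# Outer query: add up the flag combinations.
QUERY = """
SELECT
  SUM(has_a AND has_b)     AS both_count,
  SUM(has_a AND NOT has_b) AS a_count,
  SUM(has_b AND NOT has_a) AS b_count
FROM (
  SELECT name,
         MAX(tag = 'A') AS has_a,
         MAX(tag = 'B') AS has_b
  FROM user_tags
  GROUP BY name
)
"""

def count_groups(rows):
    """Return (both_count, a_count, b_count) for (name, tag) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user_tags (name TEXT, tag TEXT)")
    conn.executemany("INSERT INTO user_tags VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchone()
    conn.close()
    return result

if __name__ == "__main__":
    print(count_groups(ROWS))
```

Unlike the CASE expression above, this variant simply leaves users with neither tag out of all three counts.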

Answer 1: (score: 0)

I don't know Google BigQuery's syntax, but here is a plain SQL solution to the problem.

    -- user_tags is a placeholder for your table name
    select a.tag_desc, count(distinct a.user) as total
    from (
    select coalesce(tA.user, tB.user) as user
      , case
          when tA.tag is not null and tB.tag is not null then 'Both'
          when tA.tag is not null and tB.tag is null then 'A Only'
          when tA.tag is null and tB.tag is not null then 'B Only'
        end as tag_desc
    -- filter each side before joining so B-only users survive the outer join
    from (select distinct user, tag from user_tags where tag = 'A') tA
      full outer join (select distinct user, tag from user_tags where tag = 'B') tB
        on tA.user = tB.user
    ) a
    group by a.tag_desc

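The subquery joins the dataset back to itself with a full outer join, which lets you evaluate both conditions (A and B) together. A CASE expression defines the three outcomes. The outer query then counts the users for each CASE result.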
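
The join amounts to comparing the set of users that have tag A against the set that have tag B. A minimal sketch of that logic in Python, using the sample rows from the question (the function name `classify` is made up for illustration and is not part of the answer):

```python
# Sample (user, tag) rows from the question.
SAMPLE = [("bob", "A"), ("bob", "A"), ("bob", "B"),
          ("tom", "A"), ("tom", "A"),
          ("amy", "B"), ("amy", "B"),
          ("jen", "A"), ("jen", "A")]

def classify(rows):
    """Count users tagged with both A and B, only A, or only B."""
    a_users = {user for user, tag in rows if tag == "A"}  # the tA side of the join
    b_users = {user for user, tag in rows if tag == "B"}  # the tB side of the join
    return {
        "Both": len(a_users & b_users),    # matched on both sides
        "A Only": len(a_users - b_users),  # no match on the tB side
        "B Only": len(b_users - a_users),  # no match on the tA side
    }

if __name__ == "__main__":
    print(classify(SAMPLE))
```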