Group rows, returning true when at least one row in the group has a true value

Asked: 2018-01-11 18:50:49

Tags: group-by google-bigquery standard-sql

I have a table like this:

+-----+-------------------+------+-------+-------+-------+
| Row |       email       | year | month | flag1 | flag2 |
+-----+-------------------+------+-------+-------+-------+
|   1 | user1@example.com | 2018 |     1 | true  | true  |
|   2 | user1@example.com | 2018 |     1 | false | true  |
|   3 | user1@example.com | 2018 |     1 | true  | true  |
|   4 | user2@example.com | 2018 |     1 | false | false |
|   5 | user2@example.com | 2018 |     1 | false | false |
|   6 | user2@example.com | 2018 |     1 | false | false |
|   7 | user3@example.com | 2018 |     1 | true  | false |
|   8 | user3@example.com | 2018 |     1 | true  | false |
|   9 | user3@example.com | 2018 |     1 | false | false |
+-----+-------------------+------+-------+-------+-------+

It can be generated with this statement:

#standardSQL
WITH table AS (
  SELECT "user1@example.com" as email, 2018 as year, 1 as month, TRUE AS flag1, TRUE as flag2
  UNION ALL
  SELECT "user1@example.com",2018,1,FALSE,TRUE
  UNION ALL
  SELECT "user1@example.com",2018,1,TRUE,TRUE
  UNION ALL
  SELECT "user2@example.com",2018,1,FALSE,FALSE
  UNION ALL
  SELECT "user2@example.com",2018,1,FALSE,FALSE
  UNION ALL
  SELECT "user2@example.com",2018,1,FALSE,FALSE
  UNION ALL
  SELECT "user3@example.com",2018,1,TRUE,FALSE
  UNION ALL
  SELECT "user3@example.com",2018,1,TRUE,FALSE
  UNION ALL
  SELECT "user3@example.com",2018,1,FALSE,FALSE
)

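I want to group by email, year, and month. For each of the two flag columns, the output table should contain true if at least one row in the group has true in that column.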

The resulting table should look like this:

+-----+-------------------+------+-------+-------+-------+
| Row |       email       | year | month | flag1 | flag2 |
+-----+-------------------+------+-------+-------+-------+
|   1 | user1@example.com | 2018 |     1 | true  | true  |
|   2 | user2@example.com | 2018 |     1 | false | false |
|   3 | user3@example.com | 2018 |     1 | true  | false |
+-----+-------------------+------+-------+-------+-------+

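I started by grouping all the flags by the first 3 columns, but now I'm stuck on determining whether each array contains at least one true value: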

SELECT email,
  year,
  month,
  ARRAY_AGG(flag1) as flag1,
  ARRAY_AGG(flag2) as flag2
FROM table
GROUP BY 1,2,3

1 Answer:

Answer 0 (score: 1)

#standardSQL
SELECT email,
  year,
  month,
  LOGICAL_OR(flag1) AS flag1,
  LOGICAL_OR(flag2) AS flag2
FROM table
GROUP BY 1,2,3
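
For completeness, here is a sketch that puts the sample data from the question and the query above into one statement, so it can be pasted into the BigQuery console as-is. LOGICAL_OR is an aggregate function that returns TRUE when at least one non-NULL input in the group is TRUE, which is why the intermediate ARRAY_AGG step is not needed. The ORDER BY at the end is my addition, only to make the output order predictable.

```sql
#standardSQL
-- Sample data copied from the question.
WITH table AS (
  SELECT "user1@example.com" AS email, 2018 AS year, 1 AS month, TRUE AS flag1, TRUE AS flag2
  UNION ALL
  SELECT "user1@example.com", 2018, 1, FALSE, TRUE
  UNION ALL
  SELECT "user1@example.com", 2018, 1, TRUE, TRUE
  UNION ALL
  SELECT "user2@example.com", 2018, 1, FALSE, FALSE
  UNION ALL
  SELECT "user2@example.com", 2018, 1, FALSE, FALSE
  UNION ALL
  SELECT "user2@example.com", 2018, 1, FALSE, FALSE
  UNION ALL
  SELECT "user3@example.com", 2018, 1, TRUE, FALSE
  UNION ALL
  SELECT "user3@example.com", 2018, 1, TRUE, FALSE
  UNION ALL
  SELECT "user3@example.com", 2018, 1, FALSE, FALSE
)
SELECT
  email,
  year,
  month,
  LOGICAL_OR(flag1) AS flag1,  -- TRUE if any row in the group has flag1 = TRUE
  LOGICAL_OR(flag2) AS flag2   -- TRUE if any row in the group has flag2 = TRUE
FROM table
GROUP BY email, year, month
ORDER BY email                 -- added only for a stable output order
```

With the sample data this should give true/true for user1, false/false for user2, and true/false for user3, matching the expected table in the question. One thing to keep in mind: if every value of a flag in a group is NULL, LOGICAL_OR returns NULL rather than FALSE, so wrap it in IFNULL(..., FALSE) if your real data can contain NULLs and you need a strict boolean.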